Machine Learning Engineer Nanodegree

Deep Learning

Project: Build a Digit Recognition Program

In this notebook, a template is provided for you to implement your functionality in stages which is required to successfully complete this project. If additional code is required that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission, if necessary. Sections that begin with 'Implementation' in the header indicate where you should begin your implementation for your project. Note that some sections of implementation are optional, and will be marked with 'Optional' in the header.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can be edited by typically double-clicking the cell to enter edit mode.


Step 1: Design and Test a Model Architecture

Design and implement a deep learning model that learns to recognize sequences of digits. Train the model using synthetic data generated by concatenating character images from notMNIST or MNIST. To produce a synthetic sequence of digits for testing, you can for example limit yourself to sequences up to five digits, and use five classifiers on top of your deep network. You would have to incorporate an additional ‘blank’ character to account for shorter number sequences.

There are various aspects to consider when thinking about this problem:

  • Your model can be derived from a deep neural net or a convolutional network.
  • You could experiment sharing or not the weights between the softmax classifiers.
  • You can also use a recurrent network in your deep neural net to replace the classification layers and directly emit the sequence of digits one-at-a-time.

Here is an example of a published baseline model on this problem. (video)

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [1]:
import os
import sys
from six.moves.urllib.request import urlretrieve
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import idx2numpy
import time
from scipy.io import loadmat
In [2]:
t1 = time.clock()
train_and_valid_dataset = idx2numpy.convert_from_file('Data/train-images-idx3-ubyte')
test_dataset = idx2numpy.convert_from_file('Data/t10k-images-idx3-ubyte')
train_and_valid_labels = idx2numpy.convert_from_file('Data/train-labels-idx1-ubyte')
test_labels = idx2numpy.convert_from_file('Data/t10k-labels-idx1-ubyte')
t2 = time.clock()
print('Complete in %.2f seconds.'%(t2-t1))
Complete in 4.75 seconds.
In [6]:
# Arrange training, validation, and testing sets

train_dataset = train_and_valid_dataset[:50000,:]
valid_dataset = train_and_valid_dataset[50000:,:]

train_labels = train_and_valid_labels[:50000]
valid_labels = train_and_valid_labels[50000:]

print('Training set size: {}'.format(train_dataset.shape))
print('Validation set size: {}'.format(valid_dataset.shape))
print('Testing set size: {}'.format(test_dataset.shape))

print('Training label size: {}'.format(train_labels.shape))
print('Validation label size: {}'.format(valid_labels.shape))
print('Testing label size: {}'.format(test_labels.shape))
Training set size: (50000, 28, 28)
Validation set size: (10000, 28, 28)
Testing set size: (10000, 28, 28)
Training label size: (50000,)
Validation label size: (10000,)
Testing label size: (10000,)
In [7]:
def create_sets(data, labels):
    newdata = np.ndarray(shape=(data.shape[0], data.shape[1], 5*data.shape[2]))
    newlabels = np.ndarray(shape=(data.shape[0], 5))
    blank_digit = np.zeros(shape=(28,28))
    blank_labels = np.array([10,10,10,10,10])
    
    data_index = 0
    newdata_index = 0
    while newdata_index < data.shape[0]:
        # length of sequence
        length = np.random.randint(1,5)
        
        # images of digits for sequence
        data_digits = []
        for i in range(length):
            try:
                data_digits.append(data[data_index+i,:,:])
            except:
                print(data_index, i)
        
        # labels for sequence
        label_digits = np.empty(shape=(5))
        label_digits[0:length] = labels[data_index:data_index+length]
        label_digits[length:] = blank_labels[length:]
        
        # format the data
        temp_digits = []
        for i in range(5):
            if i < length:
                temp_digits.append(data_digits[i])
            else:
                temp_digits.append(blank_digit)
        
        sequence = np.concatenate(temp_digits, axis=1)
        
        # append the formatted data to the new arrays
        newdata[newdata_index,:,:] = sequence
        newlabels[newdata_index] = label_digits
        
        # update indexes
        data_index += length
        data_index %= (data.shape[0]-5)
        newdata_index += 1
    
    return newdata, newlabels
        
In [8]:
train_dataset, train_labels = create_sets(train_dataset, train_labels)
valid_dataset, valid_labels = create_sets(valid_dataset, valid_labels)
test_dataset, test_labels = create_sets(test_dataset, test_labels)
In [10]:
num_channels = 1 # greyscale
image_size = 28
num_labels = 11
seq_length = 5

def reformat_conv(dataset, labels):
  dataset = dataset.reshape(
    (-1, image_size, seq_length*image_size, num_channels)).astype(np.float32)
  #labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)
  return dataset, labels

train_dataset, train_labels = reformat_conv(train_dataset, train_labels)
valid_dataset, valid_labels = reformat_conv(valid_dataset, valid_labels)
test_dataset, test_labels = reformat_conv(test_dataset, test_labels)
print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)
Training set (50000, 28, 140, 1) (50000, 5)
Validation set (10000, 28, 140, 1) (10000, 5)
Test set (10000, 28, 140, 1) (10000, 5)
In [11]:
# TRIM VALID AND TEST SETS TO AVOID CRASHING JUPYTER
valid_dataset = valid_dataset[:2000]
valid_labels = valid_labels[:2000]
test_dataset = test_dataset[:2000]
test_labels = test_labels[:2000]
In [12]:
def elementwise_accuracy(predictions, labels):
    preds = np.transpose(np.argmax(predictions, 2))
    return 100*np.sum(preds == labels)/len(preds)/seq_length
In [13]:
def accuracy(predictions, labels):
    total = 0
    preds = np.transpose(np.argmax(predictions, 2))
    for i in range(len(preds)):
        if sum(preds[i] == labels[i]) == 5:
            total += 1
    return 100*total/len(preds)
In [14]:
batch_size = 32
patch_size = 5
depth1 = 32
depth2 = 64
num_hidden = 128
beta = 8e-3

graph = tf.Graph()

with graph.as_default():

    # Input data.
    tf_train_dataset = tf.placeholder(
        tf.float32, shape=(batch_size, image_size, image_size*seq_length, num_channels))
    tf_train_labels = tf.placeholder(tf.int32, shape=(batch_size, seq_length))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variables
    layer1_weights = tf.get_variable("ConvW1", shape=[patch_size, patch_size, 
        num_channels, depth1], initializer=tf.contrib.layers.xavier_initializer())
    layer1_bias = tf.Variable(tf.zeros([depth1]), name="ConvB1")
    layer2_weights = tf.get_variable("ConvW2", shape=[patch_size, patch_size,
        depth1, depth2], initializer=tf.contrib.layers.xavier_initializer())
    layer2_bias = tf.Variable(tf.zeros([depth2]), name="ConvB2")
    layer3_weights = tf.get_variable("FcW1", shape=[image_size // 4 * image_size // 4 * depth2, 
        num_hidden], initializer=tf.contrib.layers.xavier_initializer())
    layer3_bias = tf.Variable(tf.zeros([num_hidden], name="FcB1"))
    layer4_weights = tf.get_variable("ClfW", shape=[num_hidden, num_labels],
        initializer=tf.contrib.layers.xavier_initializer())
    layer4_bias = tf.Variable(tf.zeros([num_labels], name="ClfB"))
    
    
    # Model.
    def model(data, dropout=False):
        
        split_data = tf.split(split_dim=2, num_split=5, value=data)
        digits = []
        
        for digit in split_data:
        
            # First convolutional layer
            conv1 = tf.nn.conv2d(digit, layer1_weights, [1, 1, 1, 1], padding='SAME')
            hidden1 = tf.nn.relu(conv1 + layer1_bias)
            pool1 = tf.nn.max_pool(hidden1, [1,2,2,1], [1,2,2,1], padding='SAME')
            
            # Second convolutional layer
            conv2 = tf.nn.conv2d(pool1, layer2_weights, [1, 1, 1, 1], padding='SAME')
            hidden2 = tf.nn.relu(conv2 + layer2_bias)
            pool2 = tf.nn.max_pool(hidden2, [1,2,2,1], [1,2,2,1], padding='SAME')
            
            # Flatten the tensor
            shape = pool2.get_shape().as_list()
            reshape = tf.reshape(pool2, [shape[0], shape[1] * shape[2] * shape[3]])
        
            # Dropout
            if dropout:
                reshape = tf.nn.dropout(reshape, keep_prob=0.6)
    
            hidden3 = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_bias)
    
            if dropout:
                hidden3 = tf.nn.dropout(hidden3, keep_prob=0.6)
        
            digits.append(tf.matmul(hidden3, layer4_weights) + layer4_bias)
            
        return digits
    
    #Training computation.
    train_logits = model(tf_train_dataset, dropout=True)
    loss = 0
    for i in range(5):
        loss += tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(train_logits[i], tf_train_labels[:,i]))
    loss += beta*(tf.nn.l2_loss(layer1_weights)+tf.nn.l2_loss(layer2_weights)+tf.nn.l2_loss(layer3_weights)+tf.nn.l2_loss(layer4_weights))
    
    # Optimizer.
    global_step = tf.Variable(0)
    learning_rate = tf.train.exponential_decay(0.01, global_step, 
            decay_steps=500, decay_rate=0.95, staircase=True)
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)
  
    # Predictions for the training, validation, and test data.
    valid_logits = model(tf_valid_dataset)
    test_logits = model(tf_test_dataset)
    for i in range(5):
        valid_logits[i] = tf.nn.softmax(valid_logits[i])
        test_logits[i] = tf.nn.softmax(test_logits[i])
    train_prediction = tf.pack(train_logits)
    valid_prediction = tf.pack(valid_logits)
    test_prediction = tf.pack(test_logits)
    
    saver = tf.train.Saver()
    
num_steps = 10001

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    #saver.restore(session, tf.train.latest_checkpoint('./'))
    print('Initialized')
    
    for step in range(num_steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run(
          [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 1000 == 0):
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
            print('Validation accuracy: %.1f%%' % accuracy(
                valid_prediction.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels))
    save_path = saver.save(session, "./tensorflowcheckpoints/sess")
    print('Model saved to {}'.format(save_path))
Initialized
Minibatch loss at step 0: 107.883690
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 1000: 3.003631
Minibatch accuracy: 71.9%
Validation accuracy: 87.8%
Minibatch loss at step 2000: 3.106699
Minibatch accuracy: 65.6%
Validation accuracy: 88.5%
Minibatch loss at step 3000: 2.951284
Minibatch accuracy: 62.5%
Validation accuracy: 85.2%
Minibatch loss at step 4000: 2.334377
Minibatch accuracy: 87.5%
Validation accuracy: 85.6%
Minibatch loss at step 5000: 2.820235
Minibatch accuracy: 62.5%
Validation accuracy: 89.6%
Minibatch loss at step 6000: 2.543198
Minibatch accuracy: 71.9%
Validation accuracy: 85.6%
Minibatch loss at step 7000: 2.770251
Minibatch accuracy: 65.6%
Validation accuracy: 85.1%
Minibatch loss at step 8000: 2.411674
Minibatch accuracy: 62.5%
Validation accuracy: 86.5%
Minibatch loss at step 9000: 3.147276
Minibatch accuracy: 68.8%
Validation accuracy: 86.8%
Minibatch loss at step 10000: 3.838606
Minibatch accuracy: 68.8%
Validation accuracy: 88.2%
Test accuracy: 85.8%
Model saved to ./tensorflowcheckpoints/sess

Question 1

What approach did you take in coming up with a solution to this problem?

Answer:

After loading the training and testing data, I created sequences of between 1 and 5 digits. The sequences are input into the model, which is a two-layer convolutional network. I chose a convolutional network because it is an effective tool for processing images.

The model splits the sequences into five separate digits and performs the convolutions and final connections independently, into 11 output classes representing the digits 0-9 and a blank digit, encoded with a 10.

Question 2

What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.)

Answer:

The final architecture has two 5x5 convolutions with 2x2 max pooling of stride 2, followed by a fully connected layer with dropout. L2 regularization is also used.

The optimizer is Adam optimizer. I chose this one because it seems to work better than gradient descent and the momentum optimizer.

Question 3

How did you train your model? How did you generate your synthetic dataset? Include examples of images from the synthetic data you constructed.

Answer:

I trained the model on sequences of variable length. The synthetic dataset is generated by the create_sets function, which selects a number between 1 and 5, and appends digits and labels from the raw dataset to numpy arrays. Some examples of training data and labels can be seen below.

In [17]:
def plotnum(data):
    plt.imshow(data)
    plt.show()

for _ in range(3):
    i = np.random.randint(100,200)
    plotnum(np.squeeze(train_dataset[i]))
    print(train_labels[i].astype(np.int32))
[ 5  7  4 10 10]
[ 4  4 10 10 10]
[ 9  3  9 10 10]

Step 2: Train a Model on a Realistic Dataset

Once you have settled on a good architecture, you can train your model on real data. In particular, the Street View House Numbers (SVHN) dataset is a good large-scale dataset collected from house numbers in Google Street View. Training on this more challenging dataset, where the digits are not neatly lined-up and have various skews, fonts and colors, likely means you have to do some hyperparameter exploration to perform well.

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [2]:
import os
import sys
from six.moves.urllib.request import urlretrieve
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import idx2numpy
import time
from scipy.io import loadmat
In [19]:
train_data = loadmat('cropped_data/train_32x32.mat')
test_data = loadmat('cropped_data/test_32x32.mat')
In [20]:
# this converts the labels for zero to 0  [was 10]
ten_to_zero = np.vectorize(lambda x: 0 if x==10 else x)
train_data['y'] = ten_to_zero(train_data['y'])
In [21]:
train_dataset = train_data['X']
test_dataset = test_data['X']
train_labels = train_data['y']
test_labels = test_data['y']

# switch image index to first index
train_dataset = np.rollaxis(train_dataset, 3)
test_dataset = np.rollaxis(test_dataset, 3)
In [22]:
# Rearrange to [-1,1) 
train_dataset = (train_dataset / 128.0) - 1.0
test_dataset = (test_dataset / 128.0) - 1.0

# Create validation set
valid_dataset = train_dataset[0:5000, :, :, :]
valid_labels = train_labels[0:5000, :]
train_dataset = train_dataset[5000:, :, :, :]
train_labels = train_labels[5000:, :]

# Shrink Test Dataset because jupyter keeps crashing
valid_dataset = valid_dataset[:2000]
valid_labels = valid_labels[:2000]
test_dataset = test_dataset[:2000]
test_labels = test_labels[:2000]
In [23]:
print(train_dataset.shape, train_labels.shape)
print(valid_dataset.shape, valid_labels.shape)
print(test_dataset.shape, test_labels.shape)
(68257, 32, 32, 3) (68257, 1)
(2000, 32, 32, 3) (2000, 1)
(2000, 32, 32, 3) (2000, 1)
In [24]:
image_size = 32
num_channels = 3
num_labels = 10

def reformat_conv(dataset, labels):
  dataset = dataset.reshape(
    (-1, image_size, image_size, num_channels)).astype(np.float32)
  labels = (np.arange(num_labels) == labels[:,]).astype(np.float32)
  return dataset, labels
In [25]:
train_dataset, train_labels = reformat_conv(train_dataset, train_labels)
valid_dataset, valid_labels = reformat_conv(valid_dataset, valid_labels)
test_dataset, test_labels = reformat_conv(test_dataset, test_labels)
In [26]:
print(train_dataset.shape, train_labels.shape)
print(valid_dataset.shape, valid_labels.shape)
print(test_dataset.shape, test_labels.shape)
(68257, 32, 32, 3) (68257, 10)
(2000, 32, 32, 3) (2000, 10)
(2000, 32, 32, 3) (2000, 10)
In [27]:
def accuracy(preds, labels):
    return 100.0*( np.sum(np.argmax(preds,1) == np.argmax(labels,1)))/(preds.shape[0])
In [35]:
image_size = 32
num_channels = 3
depth1 = 16
depth2 = 32
depth3 = 64
depth4 = 128
kp_fc = 0.5
kp_conv = 0.9
depth5 = 256
num_classes = 10
beta=5e-5

batch_size = 20

patch_size = 4
num_labels = 10

graph = tf.Graph()

with graph.as_default():
    # Input
    tf_train_dataset = tf.placeholder(tf.float32, shape=(None,image_size,image_size,num_channels))
    tf_train_labels = tf.placeholder(tf.float32, shape=(None,num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.cast(tf.constant(test_dataset), tf.float32)
    
    # Variables
    weight_layer1 = tf.get_variable("ConvW1", shape=[patch_size, patch_size, 
        num_channels, depth1], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer1 = tf.Variable(tf.constant(1.0, shape=[depth1]))
    
    weight_layer2 = tf.get_variable("ConvW2", shape=[patch_size, patch_size, 
        depth1, depth2], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer2 = tf.Variable(tf.constant(1.0, shape=[depth2]))
    
    weight_layer3 = tf.get_variable("ConvW3", shape=[patch_size, patch_size, 
        depth2, depth3], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer3 = tf.Variable(tf.constant(1.0, shape=[depth3]))
    
    weight_layer4 = tf.get_variable("ConvW4", shape=[patch_size, patch_size, 
        depth3, depth4], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer4 = tf.Variable(tf.constant(1.0, shape=[depth4]))
    
    weight_layer5 = tf.get_variable("FcW1", shape=[2 * 2 * depth4, depth5], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer5 = tf.Variable(tf.constant(1.0, shape=[depth5]))
    
    weight_layer6 = tf.get_variable("FcW2", shape=[depth5, num_classes], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer6 = tf.Variable(tf.constant(1.0, shape=[num_classes]))
    
    # Model
    def model(data, dropout=False):
        # Convolution 1
        conv_1 = tf.nn.conv2d(data, weight_layer1, [1,1,1,1], padding='SAME')
        hidden_1 = tf.nn.relu(conv_1 + bias_layer1)
        pool_1 = tf.nn.max_pool(hidden_1, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        if dropout:
            pool_1 = tf.nn.dropout(pool_1, keep_prob=kp_conv)
        
        # Convolution 2
        conv_2 = tf.nn.conv2d(pool_1, weight_layer2, [1,1,1,1], padding='SAME')
        hidden_2 = tf.nn.relu(conv_2 + bias_layer2)
        pool_2 = tf.nn.max_pool(hidden_2, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        if dropout:
            pool_2 = tf.nn.dropout(pool_2, keep_prob=kp_conv)
        
        # Convolution 3
        conv_3 = tf.nn.conv2d(pool_2, weight_layer3, [1,1,1,1], padding='SAME')
        hidden_3 = tf.nn.relu(conv_3 + bias_layer3)
        pool_3 = tf.nn.max_pool(hidden_3, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        if dropout:
            pool_3 = tf.nn.dropout(pool_3, keep_prob=kp_conv)
            
        # Convolution 4
        conv_4 = tf.nn.conv2d(pool_3, weight_layer4, [1,1,1,1], padding='SAME')
        hidden_4 = tf.nn.relu(conv_4 + bias_layer4)
        pool_4 = tf.nn.max_pool(hidden_4, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        # Reshape
        shape = pool_4.get_shape().as_list()
        reshape = tf.reshape(pool_4, [-1, shape[2] * shape[3] * shape[1]])
        
        if dropout:
            reshape = tf.nn.dropout(reshape, keep_prob=kp_fc)
        
        # Fully Connected
        fc = tf.nn.relu(tf.matmul(reshape, weight_layer5) + bias_layer5)
        
        # Output Classes
        return tf.matmul(fc, weight_layer6) + bias_layer6
        
    # Training computation.
    logits = model(tf_train_dataset, dropout=True)
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))
    loss += beta*(tf.nn.l2_loss(weight_layer1)+tf.nn.l2_loss(weight_layer2)+tf.nn.l2_loss(weight_layer3)+tf.nn.l2_loss(weight_layer4)+tf.nn.l2_loss(weight_layer5)+tf.nn.l2_loss(weight_layer6))
    
    # Optimizer.
    global_step = tf.Variable(0)
    learning_rate = tf.train.exponential_decay(0.001, global_step, 
                decay_steps=10000, decay_rate=0.95, staircase=True)
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)

    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    valid_prediction = tf.nn.softmax(model(tf_valid_dataset))
    test_prediction = tf.nn.softmax(model(tf_test_dataset))
    
    saver = tf.train.Saver()

num_steps = 200001

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized')
    for step in range(num_steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run(
          [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 500 == 0):
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
        if (step % 2000 == 0):
            print('Validation accuracy: %.1f%%' % accuracy(
                valid_prediction.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels))
    save_path = saver.save(session, "./tensorflowcheckpoints/svhn")
    print('Model saved to {}'.format(save_path))
Initialized
Minibatch loss at step 0: 8.464724
Minibatch accuracy: 15.0%
Validation accuracy: 13.1%
Minibatch loss at step 500: 2.315315
Minibatch accuracy: 10.0%
Minibatch loss at step 1000: 2.222620
Minibatch accuracy: 15.0%
Minibatch loss at step 1500: 2.417467
Minibatch accuracy: 5.0%
Minibatch loss at step 2000: 2.335016
Minibatch accuracy: 10.0%
Validation accuracy: 11.7%
Minibatch loss at step 2500: 2.178340
Minibatch accuracy: 20.0%
Minibatch loss at step 3000: 2.302956
Minibatch accuracy: 10.0%
Minibatch loss at step 3500: 2.280455
Minibatch accuracy: 15.0%
Minibatch loss at step 4000: 2.345700
Minibatch accuracy: 20.0%
Validation accuracy: 20.4%
Minibatch loss at step 4500: 2.168889
Minibatch accuracy: 25.0%
Minibatch loss at step 5000: 2.273650
Minibatch accuracy: 10.0%
Minibatch loss at step 5500: 2.313999
Minibatch accuracy: 10.0%
Minibatch loss at step 6000: 2.279463
Minibatch accuracy: 20.0%
Validation accuracy: 20.4%
Minibatch loss at step 6500: 2.564659
Minibatch accuracy: 0.0%
Minibatch loss at step 7000: 2.183697
Minibatch accuracy: 15.0%
Minibatch loss at step 7500: 2.387326
Minibatch accuracy: 20.0%
Minibatch loss at step 8000: 2.526532
Minibatch accuracy: 5.0%
Validation accuracy: 20.4%
Minibatch loss at step 8500: 2.272584
Minibatch accuracy: 10.0%
Minibatch loss at step 9000: 2.071527
Minibatch accuracy: 30.0%
Minibatch loss at step 9500: 2.288290
Minibatch accuracy: 15.0%
Minibatch loss at step 10000: 2.280718
Minibatch accuracy: 20.0%
Validation accuracy: 10.6%
Minibatch loss at step 10500: 2.375622
Minibatch accuracy: 20.0%
Minibatch loss at step 11000: 2.163999
Minibatch accuracy: 30.0%
Minibatch loss at step 11500: 2.348579
Minibatch accuracy: 10.0%
Minibatch loss at step 12000: 2.214376
Minibatch accuracy: 15.0%
Validation accuracy: 11.7%
Minibatch loss at step 12500: 2.083071
Minibatch accuracy: 30.0%
Minibatch loss at step 13000: 2.276244
Minibatch accuracy: 20.0%
Minibatch loss at step 13500: 2.301828
Minibatch accuracy: 15.0%
Minibatch loss at step 14000: 2.244377
Minibatch accuracy: 35.0%
Validation accuracy: 20.4%
Minibatch loss at step 14500: 2.129229
Minibatch accuracy: 35.0%
Minibatch loss at step 15000: 2.289676
Minibatch accuracy: 0.0%
Minibatch loss at step 15500: 2.206392
Minibatch accuracy: 25.0%
Minibatch loss at step 16000: 2.209452
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 16500: 2.065497
Minibatch accuracy: 20.0%
Minibatch loss at step 17000: 2.396724
Minibatch accuracy: 10.0%
Minibatch loss at step 17500: 2.305950
Minibatch accuracy: 20.0%
Minibatch loss at step 18000: 2.196406
Minibatch accuracy: 30.0%
Validation accuracy: 20.4%
Minibatch loss at step 18500: 2.312397
Minibatch accuracy: 15.0%
Minibatch loss at step 19000: 2.150339
Minibatch accuracy: 30.0%
Minibatch loss at step 19500: 2.287873
Minibatch accuracy: 15.0%
Minibatch loss at step 20000: 2.252942
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 20500: 2.296975
Minibatch accuracy: 40.0%
Minibatch loss at step 21000: 2.232469
Minibatch accuracy: 15.0%
Minibatch loss at step 21500: 2.198601
Minibatch accuracy: 35.0%
Minibatch loss at step 22000: 2.371152
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 22500: 2.179210
Minibatch accuracy: 35.0%
Minibatch loss at step 23000: 2.320178
Minibatch accuracy: 0.0%
Minibatch loss at step 23500: 2.341861
Minibatch accuracy: 20.0%
Minibatch loss at step 24000: 2.293698
Minibatch accuracy: 10.0%
Validation accuracy: 20.4%
Minibatch loss at step 24500: 2.526417
Minibatch accuracy: 0.0%
Minibatch loss at step 25000: 2.350042
Minibatch accuracy: 15.0%
Minibatch loss at step 25500: 2.207563
Minibatch accuracy: 10.0%
Minibatch loss at step 26000: 2.346906
Minibatch accuracy: 10.0%
Validation accuracy: 20.4%
Minibatch loss at step 26500: 2.425871
Minibatch accuracy: 10.0%
Minibatch loss at step 27000: 2.245231
Minibatch accuracy: 25.0%
Minibatch loss at step 27500: 2.105148
Minibatch accuracy: 30.0%
Minibatch loss at step 28000: 2.198267
Minibatch accuracy: 25.0%
Validation accuracy: 11.7%
Minibatch loss at step 28500: 2.297769
Minibatch accuracy: 10.0%
Minibatch loss at step 29000: 2.284580
Minibatch accuracy: 15.0%
Minibatch loss at step 29500: 2.264293
Minibatch accuracy: 10.0%
Minibatch loss at step 30000: 2.268328
Minibatch accuracy: 25.0%
Validation accuracy: 11.7%
Minibatch loss at step 30500: 2.353521
Minibatch accuracy: 10.0%
Minibatch loss at step 31000: 2.258246
Minibatch accuracy: 15.0%
Minibatch loss at step 31500: 2.321420
Minibatch accuracy: 15.0%
Minibatch loss at step 32000: 2.247659
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 32500: 2.178431
Minibatch accuracy: 30.0%
Minibatch loss at step 33000: 2.212223
Minibatch accuracy: 15.0%
Minibatch loss at step 33500: 2.112572
Minibatch accuracy: 30.0%
Minibatch loss at step 34000: 2.250037
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 34500: 2.398123
Minibatch accuracy: 0.0%
Minibatch loss at step 35000: 2.321697
Minibatch accuracy: 10.0%
Minibatch loss at step 35500: 2.275139
Minibatch accuracy: 10.0%
Minibatch loss at step 36000: 2.213479
Minibatch accuracy: 10.0%
Validation accuracy: 11.7%
Minibatch loss at step 36500: 2.119661
Minibatch accuracy: 25.0%
Minibatch loss at step 37000: 2.148446
Minibatch accuracy: 15.0%
Minibatch loss at step 37500: 2.249366
Minibatch accuracy: 10.0%
Minibatch loss at step 38000: 2.234845
Minibatch accuracy: 20.0%
Validation accuracy: 20.4%
Minibatch loss at step 38500: 2.154729
Minibatch accuracy: 25.0%
Minibatch loss at step 39000: 2.289870
Minibatch accuracy: 15.0%
Minibatch loss at step 39500: 2.214091
Minibatch accuracy: 30.0%
Minibatch loss at step 40000: 2.159009
Minibatch accuracy: 35.0%
Validation accuracy: 20.4%
Minibatch loss at step 40500: 2.179799
Minibatch accuracy: 15.0%
Minibatch loss at step 41000: 2.232214
Minibatch accuracy: 20.0%
Minibatch loss at step 41500: 2.188856
Minibatch accuracy: 25.0%
Minibatch loss at step 42000: 2.267801
Minibatch accuracy: 20.0%
Validation accuracy: 20.4%
Minibatch loss at step 42500: 2.329928
Minibatch accuracy: 15.0%
Minibatch loss at step 43000: 2.168795
Minibatch accuracy: 30.0%
Minibatch loss at step 43500: 2.160116
Minibatch accuracy: 25.0%
Minibatch loss at step 44000: 2.343309
Minibatch accuracy: 5.0%
Validation accuracy: 20.4%
Minibatch loss at step 44500: 2.088958
Minibatch accuracy: 30.0%
Minibatch loss at step 45000: 2.456525
Minibatch accuracy: 5.0%
Minibatch loss at step 45500: 2.198148
Minibatch accuracy: 20.0%
Minibatch loss at step 46000: 2.138525
Minibatch accuracy: 20.0%
Validation accuracy: 20.4%
Minibatch loss at step 46500: 2.316437
Minibatch accuracy: 10.0%
Minibatch loss at step 47000: 2.262851
Minibatch accuracy: 5.0%
Minibatch loss at step 47500: 2.255299
Minibatch accuracy: 20.0%
Minibatch loss at step 48000: 2.271548
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 48500: 2.315410
Minibatch accuracy: 5.0%
Minibatch loss at step 49000: 2.237458
Minibatch accuracy: 20.0%
Minibatch loss at step 49500: 2.251611
Minibatch accuracy: 15.0%
Minibatch loss at step 50000: 2.004283
Minibatch accuracy: 50.0%
Validation accuracy: 20.4%
Minibatch loss at step 50500: 2.125087
Minibatch accuracy: 40.0%
Minibatch loss at step 51000: 2.262375
Minibatch accuracy: 20.0%
Minibatch loss at step 51500: 2.303635
Minibatch accuracy: 20.0%
Minibatch loss at step 52000: 2.294367
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 52500: 2.193917
Minibatch accuracy: 25.0%
Minibatch loss at step 53000: 2.261641
Minibatch accuracy: 0.0%
Minibatch loss at step 53500: 2.229027
Minibatch accuracy: 25.0%
Minibatch loss at step 54000: 2.260987
Minibatch accuracy: 30.0%
Validation accuracy: 20.4%
Minibatch loss at step 54500: 2.205821
Minibatch accuracy: 20.0%
Minibatch loss at step 55000: 2.290090
Minibatch accuracy: 20.0%
Minibatch loss at step 55500: 2.220389
Minibatch accuracy: 15.0%
Minibatch loss at step 56000: 2.245391
Minibatch accuracy: 20.0%
Validation accuracy: 20.4%
Minibatch loss at step 56500: 2.276308
Minibatch accuracy: 10.0%
Minibatch loss at step 57000: 2.345555
Minibatch accuracy: 10.0%
Minibatch loss at step 57500: 2.209274
Minibatch accuracy: 30.0%
Minibatch loss at step 58000: 2.229339
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 58500: 2.322537
Minibatch accuracy: 0.0%
Minibatch loss at step 59000: 2.269896
Minibatch accuracy: 20.0%
Minibatch loss at step 59500: 2.284218
Minibatch accuracy: 15.0%
Minibatch loss at step 60000: 2.429593
Minibatch accuracy: 5.0%
Validation accuracy: 20.4%
Minibatch loss at step 60500: 2.282446
Minibatch accuracy: 15.0%
Minibatch loss at step 61000: 2.350346
Minibatch accuracy: 5.0%
Minibatch loss at step 61500: 2.166459
Minibatch accuracy: 20.0%
Minibatch loss at step 62000: 2.273195
Minibatch accuracy: 15.0%
Validation accuracy: 20.4%
Minibatch loss at step 62500: 2.286956
Minibatch accuracy: 15.0%
Minibatch loss at step 63000: 2.291474
Minibatch accuracy: 10.0%
Minibatch loss at step 63500: 2.259618
Minibatch accuracy: 15.0%
Minibatch loss at step 64000: 2.081769
Minibatch accuracy: 30.0%
Validation accuracy: 25.4%
Minibatch loss at step 64500: 1.953902
Minibatch accuracy: 25.0%
Minibatch loss at step 65000: 1.573736
Minibatch accuracy: 35.0%
Minibatch loss at step 65500: 0.861632
Minibatch accuracy: 65.0%
Minibatch loss at step 66000: 1.124906
Minibatch accuracy: 60.0%
Validation accuracy: 72.8%
Minibatch loss at step 66500: 0.646184
Minibatch accuracy: 90.0%
Minibatch loss at step 67000: 1.054070
Minibatch accuracy: 65.0%
Minibatch loss at step 67500: 0.584079
Minibatch accuracy: 85.0%
Minibatch loss at step 68000: 0.513942
Minibatch accuracy: 85.0%
Validation accuracy: 85.1%
Minibatch loss at step 68500: 0.692636
Minibatch accuracy: 85.0%
Minibatch loss at step 69000: 0.850657
Minibatch accuracy: 70.0%
Minibatch loss at step 69500: 0.554620
Minibatch accuracy: 80.0%
Minibatch loss at step 70000: 0.371891
Minibatch accuracy: 80.0%
Validation accuracy: 87.2%
Minibatch loss at step 70500: 0.560263
Minibatch accuracy: 75.0%
Minibatch loss at step 71000: 0.392128
Minibatch accuracy: 90.0%
Minibatch loss at step 71500: 0.339975
Minibatch accuracy: 95.0%
Minibatch loss at step 72000: 0.301290
Minibatch accuracy: 90.0%
Validation accuracy: 88.2%
Minibatch loss at step 72500: 0.190545
Minibatch accuracy: 95.0%
Minibatch loss at step 73000: 0.438128
Minibatch accuracy: 85.0%
Minibatch loss at step 73500: 0.516519
Minibatch accuracy: 90.0%
Minibatch loss at step 74000: 0.189988
Minibatch accuracy: 95.0%
Validation accuracy: 89.8%
Minibatch loss at step 74500: 0.090351
Minibatch accuracy: 100.0%
Minibatch loss at step 75000: 0.189134
Minibatch accuracy: 95.0%
Minibatch loss at step 75500: 0.425572
Minibatch accuracy: 85.0%
Minibatch loss at step 76000: 0.532361
Minibatch accuracy: 75.0%
Validation accuracy: 89.2%
Minibatch loss at step 76500: 0.582587
Minibatch accuracy: 85.0%
Minibatch loss at step 77000: 0.636158
Minibatch accuracy: 70.0%
Minibatch loss at step 77500: 0.337975
Minibatch accuracy: 85.0%
Minibatch loss at step 78000: 0.571636
Minibatch accuracy: 75.0%
Validation accuracy: 89.2%
Minibatch loss at step 78500: 0.644371
Minibatch accuracy: 80.0%
Minibatch loss at step 79000: 0.293078
Minibatch accuracy: 95.0%
Minibatch loss at step 79500: 0.402712
Minibatch accuracy: 85.0%
Minibatch loss at step 80000: 1.238832
Minibatch accuracy: 80.0%
Validation accuracy: 90.8%
Minibatch loss at step 80500: 0.444505
Minibatch accuracy: 85.0%
Minibatch loss at step 81000: 0.385672
Minibatch accuracy: 90.0%
Minibatch loss at step 81500: 0.290404
Minibatch accuracy: 95.0%
Minibatch loss at step 82000: 0.230766
Minibatch accuracy: 90.0%
Validation accuracy: 89.7%
Minibatch loss at step 82500: 0.398317
Minibatch accuracy: 85.0%
Minibatch loss at step 83000: 0.561991
Minibatch accuracy: 90.0%
Minibatch loss at step 83500: 0.643703
Minibatch accuracy: 85.0%
Minibatch loss at step 84000: 0.621197
Minibatch accuracy: 80.0%
Validation accuracy: 90.2%
Minibatch loss at step 84500: 0.715592
Minibatch accuracy: 85.0%
Minibatch loss at step 85000: 0.080342
Minibatch accuracy: 100.0%
Minibatch loss at step 85500: 0.331776
Minibatch accuracy: 95.0%
Minibatch loss at step 86000: 0.634930
Minibatch accuracy: 85.0%
Validation accuracy: 90.8%
Minibatch loss at step 86500: 0.475977
Minibatch accuracy: 85.0%
Minibatch loss at step 87000: 0.653022
Minibatch accuracy: 80.0%
Minibatch loss at step 87500: 0.193470
Minibatch accuracy: 90.0%
Minibatch loss at step 88000: 0.877436
Minibatch accuracy: 65.0%
Validation accuracy: 90.9%
Minibatch loss at step 88500: 0.238418
Minibatch accuracy: 90.0%
Minibatch loss at step 89000: 0.537105
Minibatch accuracy: 85.0%
Minibatch loss at step 89500: 0.485584
Minibatch accuracy: 95.0%
Minibatch loss at step 90000: 0.182811
Minibatch accuracy: 100.0%
Validation accuracy: 91.5%
Minibatch loss at step 90500: 0.181648
Minibatch accuracy: 95.0%
Minibatch loss at step 91000: 0.670557
Minibatch accuracy: 80.0%
Minibatch loss at step 91500: 0.432267
Minibatch accuracy: 85.0%
Minibatch loss at step 92000: 0.635297
Minibatch accuracy: 85.0%
Validation accuracy: 91.0%
Minibatch loss at step 92500: 0.327953
Minibatch accuracy: 95.0%
Minibatch loss at step 93000: 0.672638
Minibatch accuracy: 80.0%
Minibatch loss at step 93500: 0.199530
Minibatch accuracy: 100.0%
Minibatch loss at step 94000: 0.268592
Minibatch accuracy: 90.0%
Validation accuracy: 91.0%
Minibatch loss at step 94500: 0.270211
Minibatch accuracy: 95.0%
Minibatch loss at step 95000: 0.299079
Minibatch accuracy: 95.0%
Minibatch loss at step 95500: 0.598413
Minibatch accuracy: 85.0%
Minibatch loss at step 96000: 0.376983
Minibatch accuracy: 90.0%
Validation accuracy: 91.2%
Minibatch loss at step 96500: 0.979354
Minibatch accuracy: 75.0%
Minibatch loss at step 97000: 0.286049
Minibatch accuracy: 95.0%
Minibatch loss at step 97500: 0.293130
Minibatch accuracy: 95.0%
Minibatch loss at step 98000: 0.605343
Minibatch accuracy: 80.0%
Validation accuracy: 91.7%
Minibatch loss at step 98500: 0.127297
Minibatch accuracy: 100.0%
Minibatch loss at step 99000: 0.294558
Minibatch accuracy: 95.0%
Minibatch loss at step 99500: 0.202106
Minibatch accuracy: 95.0%
Minibatch loss at step 100000: 0.503861
Minibatch accuracy: 90.0%
Validation accuracy: 91.5%
Minibatch loss at step 100500: 0.277443
Minibatch accuracy: 85.0%
Minibatch loss at step 101000: 0.176820
Minibatch accuracy: 100.0%
Minibatch loss at step 101500: 0.107996
Minibatch accuracy: 100.0%
Minibatch loss at step 102000: 0.176062
Minibatch accuracy: 95.0%
Validation accuracy: 91.2%
Minibatch loss at step 102500: 0.525340
Minibatch accuracy: 90.0%
Minibatch loss at step 103000: 0.499330
Minibatch accuracy: 80.0%
Minibatch loss at step 103500: 0.421712
Minibatch accuracy: 85.0%
Minibatch loss at step 104000: 0.550010
Minibatch accuracy: 85.0%
Validation accuracy: 91.8%
Minibatch loss at step 104500: 0.123061
Minibatch accuracy: 100.0%
Minibatch loss at step 105000: 0.460997
Minibatch accuracy: 85.0%
Minibatch loss at step 105500: 0.794576
Minibatch accuracy: 80.0%
Minibatch loss at step 106000: 0.993741
Minibatch accuracy: 70.0%
Validation accuracy: 91.7%
Minibatch loss at step 106500: 0.274498
Minibatch accuracy: 95.0%
Minibatch loss at step 107000: 0.218424
Minibatch accuracy: 100.0%
Minibatch loss at step 107500: 0.310407
Minibatch accuracy: 90.0%
Minibatch loss at step 108000: 0.478306
Minibatch accuracy: 80.0%
Validation accuracy: 90.7%
Minibatch loss at step 108500: 0.218409
Minibatch accuracy: 95.0%
Minibatch loss at step 109000: 0.238762
Minibatch accuracy: 90.0%
Minibatch loss at step 109500: 0.363691
Minibatch accuracy: 85.0%
Minibatch loss at step 110000: 0.519417
Minibatch accuracy: 80.0%
Validation accuracy: 91.8%
Minibatch loss at step 110500: 0.575315
Minibatch accuracy: 75.0%
Minibatch loss at step 111000: 0.152417
Minibatch accuracy: 100.0%
Minibatch loss at step 111500: 0.305837
Minibatch accuracy: 85.0%
Minibatch loss at step 112000: 0.368571
Minibatch accuracy: 90.0%
Validation accuracy: 91.8%
Minibatch loss at step 112500: 0.412532
Minibatch accuracy: 90.0%
Minibatch loss at step 113000: 0.385887
Minibatch accuracy: 90.0%
Minibatch loss at step 113500: 0.116934
Minibatch accuracy: 100.0%
Minibatch loss at step 114000: 0.161010
Minibatch accuracy: 100.0%
Validation accuracy: 92.2%
Minibatch loss at step 114500: 0.391174
Minibatch accuracy: 95.0%
Minibatch loss at step 115000: 0.322339
Minibatch accuracy: 90.0%
Minibatch loss at step 115500: 0.161341
Minibatch accuracy: 95.0%
Minibatch loss at step 116000: 0.381322
Minibatch accuracy: 85.0%
Validation accuracy: 92.2%
Minibatch loss at step 116500: 0.415241
Minibatch accuracy: 90.0%
Minibatch loss at step 117000: 0.625292
Minibatch accuracy: 90.0%
Minibatch loss at step 117500: 0.750222
Minibatch accuracy: 85.0%
Minibatch loss at step 118000: 0.386792
Minibatch accuracy: 90.0%
Validation accuracy: 91.8%
Minibatch loss at step 118500: 0.139909
Minibatch accuracy: 100.0%
Minibatch loss at step 119000: 0.468924
Minibatch accuracy: 80.0%
Minibatch loss at step 119500: 0.291084
Minibatch accuracy: 90.0%
Minibatch loss at step 120000: 0.313335
Minibatch accuracy: 90.0%
Validation accuracy: 92.2%
Minibatch loss at step 120500: 0.689867
Minibatch accuracy: 75.0%
Minibatch loss at step 121000: 0.392144
Minibatch accuracy: 85.0%
Minibatch loss at step 121500: 0.270211
Minibatch accuracy: 95.0%
Minibatch loss at step 122000: 0.324832
Minibatch accuracy: 95.0%
Validation accuracy: 91.5%
Minibatch loss at step 122500: 0.205797
Minibatch accuracy: 95.0%
Minibatch loss at step 123000: 0.086225
Minibatch accuracy: 100.0%
Minibatch loss at step 123500: 0.554114
Minibatch accuracy: 80.0%
Minibatch loss at step 124000: 0.093732
Minibatch accuracy: 100.0%
Validation accuracy: 91.2%
Minibatch loss at step 124500: 1.010711
Minibatch accuracy: 80.0%
Minibatch loss at step 125000: 0.514219
Minibatch accuracy: 80.0%
Minibatch loss at step 125500: 0.788236
Minibatch accuracy: 75.0%
Minibatch loss at step 126000: 0.377423
Minibatch accuracy: 85.0%
Validation accuracy: 92.2%
Minibatch loss at step 126500: 0.440743
Minibatch accuracy: 85.0%
Minibatch loss at step 127000: 0.195664
Minibatch accuracy: 95.0%
Minibatch loss at step 127500: 0.133986
Minibatch accuracy: 100.0%
Minibatch loss at step 128000: 0.092187
Minibatch accuracy: 100.0%
Validation accuracy: 92.2%
Minibatch loss at step 128500: 0.376909
Minibatch accuracy: 85.0%
Minibatch loss at step 129000: 0.105213
Minibatch accuracy: 100.0%
Minibatch loss at step 129500: 0.426292
Minibatch accuracy: 95.0%
Minibatch loss at step 130000: 0.253718
Minibatch accuracy: 90.0%
Validation accuracy: 92.0%
Minibatch loss at step 130500: 0.410267
Minibatch accuracy: 90.0%
Minibatch loss at step 131000: 1.198741
Minibatch accuracy: 75.0%
Minibatch loss at step 131500: 0.335962
Minibatch accuracy: 90.0%
Minibatch loss at step 132000: 0.127819
Minibatch accuracy: 95.0%
Validation accuracy: 92.0%
Minibatch loss at step 132500: 0.172462
Minibatch accuracy: 100.0%
Minibatch loss at step 133000: 0.695742
Minibatch accuracy: 85.0%
Minibatch loss at step 133500: 0.314540
Minibatch accuracy: 90.0%
Minibatch loss at step 134000: 1.665162
Minibatch accuracy: 70.0%
Validation accuracy: 92.0%
Minibatch loss at step 134500: 0.700660
Minibatch accuracy: 80.0%
Minibatch loss at step 135000: 0.241946
Minibatch accuracy: 95.0%
Minibatch loss at step 135500: 0.131146
Minibatch accuracy: 95.0%
Minibatch loss at step 136000: 0.542124
Minibatch accuracy: 80.0%
Validation accuracy: 91.8%
Minibatch loss at step 136500: 0.411938
Minibatch accuracy: 90.0%
Minibatch loss at step 137000: 0.153701
Minibatch accuracy: 100.0%
Minibatch loss at step 137500: 0.330482
Minibatch accuracy: 95.0%
Minibatch loss at step 138000: 0.422381
Minibatch accuracy: 90.0%
Validation accuracy: 93.0%
Minibatch loss at step 138500: 0.380280
Minibatch accuracy: 90.0%
Minibatch loss at step 139000: 0.422648
Minibatch accuracy: 95.0%
Minibatch loss at step 139500: 1.095657
Minibatch accuracy: 70.0%
Minibatch loss at step 140000: 0.261371
Minibatch accuracy: 95.0%
Validation accuracy: 92.5%
Minibatch loss at step 140500: 0.143032
Minibatch accuracy: 100.0%
Minibatch loss at step 141000: 0.284447
Minibatch accuracy: 95.0%
Minibatch loss at step 141500: 0.617295
Minibatch accuracy: 85.0%
Minibatch loss at step 142000: 0.104247
Minibatch accuracy: 100.0%
Validation accuracy: 92.7%
Minibatch loss at step 142500: 0.318709
Minibatch accuracy: 95.0%
Minibatch loss at step 143000: 0.219759
Minibatch accuracy: 95.0%
Minibatch loss at step 143500: 0.474518
Minibatch accuracy: 80.0%
Minibatch loss at step 144000: 0.462234
Minibatch accuracy: 80.0%
Validation accuracy: 92.5%
Minibatch loss at step 144500: 0.259090
Minibatch accuracy: 95.0%
Minibatch loss at step 145000: 0.184238
Minibatch accuracy: 95.0%
Minibatch loss at step 145500: 0.133537
Minibatch accuracy: 100.0%
Minibatch loss at step 146000: 0.738296
Minibatch accuracy: 85.0%
Validation accuracy: 92.2%
Minibatch loss at step 146500: 0.260382
Minibatch accuracy: 90.0%
Minibatch loss at step 147000: 0.158496
Minibatch accuracy: 95.0%
Minibatch loss at step 147500: 0.412879
Minibatch accuracy: 80.0%
Minibatch loss at step 148000: 0.476031
Minibatch accuracy: 90.0%
Validation accuracy: 92.3%
Minibatch loss at step 148500: 0.212167
Minibatch accuracy: 95.0%
Minibatch loss at step 149000: 0.423828
Minibatch accuracy: 90.0%
Minibatch loss at step 149500: 0.420455
Minibatch accuracy: 90.0%
Minibatch loss at step 150000: 0.622015
Minibatch accuracy: 90.0%
Validation accuracy: 92.5%
Minibatch loss at step 150500: 0.109907
Minibatch accuracy: 100.0%
Minibatch loss at step 151000: 0.288439
Minibatch accuracy: 90.0%
Minibatch loss at step 151500: 0.383128
Minibatch accuracy: 90.0%
Minibatch loss at step 152000: 0.185597
Minibatch accuracy: 95.0%
Validation accuracy: 92.4%
Minibatch loss at step 152500: 0.590986
Minibatch accuracy: 75.0%
Minibatch loss at step 153000: 0.241097
Minibatch accuracy: 90.0%
Minibatch loss at step 153500: 0.306007
Minibatch accuracy: 95.0%
Minibatch loss at step 154000: 0.181076
Minibatch accuracy: 95.0%
Validation accuracy: 92.3%
Minibatch loss at step 154500: 0.212902
Minibatch accuracy: 100.0%
Minibatch loss at step 155000: 0.413910
Minibatch accuracy: 85.0%
Minibatch loss at step 155500: 0.282455
Minibatch accuracy: 90.0%
Minibatch loss at step 156000: 0.334971
Minibatch accuracy: 90.0%
Validation accuracy: 91.7%
Minibatch loss at step 156500: 0.262023
Minibatch accuracy: 95.0%
Minibatch loss at step 157000: 0.323860
Minibatch accuracy: 85.0%
Minibatch loss at step 157500: 0.186162
Minibatch accuracy: 95.0%
Minibatch loss at step 158000: 0.302121
Minibatch accuracy: 90.0%
Validation accuracy: 92.7%
Minibatch loss at step 158500: 0.108196
Minibatch accuracy: 100.0%
Minibatch loss at step 159000: 0.387241
Minibatch accuracy: 95.0%
Minibatch loss at step 159500: 0.309318
Minibatch accuracy: 95.0%
Minibatch loss at step 160000: 0.590476
Minibatch accuracy: 85.0%
Validation accuracy: 92.7%
Minibatch loss at step 160500: 0.350839
Minibatch accuracy: 85.0%
Minibatch loss at step 161000: 0.494779
Minibatch accuracy: 90.0%
Minibatch loss at step 161500: 0.843295
Minibatch accuracy: 80.0%
Minibatch loss at step 162000: 0.480400
Minibatch accuracy: 95.0%
Validation accuracy: 91.7%
Minibatch loss at step 162500: 0.302751
Minibatch accuracy: 90.0%
Minibatch loss at step 163000: 0.355187
Minibatch accuracy: 95.0%
Minibatch loss at step 163500: 0.234038
Minibatch accuracy: 100.0%
Minibatch loss at step 164000: 0.113648
Minibatch accuracy: 100.0%
Validation accuracy: 93.3%
Minibatch loss at step 164500: 0.118060
Minibatch accuracy: 100.0%
Minibatch loss at step 165000: 0.384543
Minibatch accuracy: 85.0%
Minibatch loss at step 165500: 0.266143
Minibatch accuracy: 90.0%
Minibatch loss at step 166000: 0.242441
Minibatch accuracy: 95.0%
Validation accuracy: 92.3%
Minibatch loss at step 166500: 0.225974
Minibatch accuracy: 95.0%
Minibatch loss at step 167000: 0.605863
Minibatch accuracy: 85.0%
Minibatch loss at step 167500: 0.103332
Minibatch accuracy: 100.0%
Minibatch loss at step 168000: 0.083588
Minibatch accuracy: 100.0%
Validation accuracy: 92.6%
Minibatch loss at step 168500: 0.407843
Minibatch accuracy: 85.0%
Minibatch loss at step 169000: 0.855228
Minibatch accuracy: 75.0%
Minibatch loss at step 169500: 0.728947
Minibatch accuracy: 75.0%
Minibatch loss at step 170000: 0.632734
Minibatch accuracy: 80.0%
Validation accuracy: 92.7%
Minibatch loss at step 170500: 0.442844
Minibatch accuracy: 90.0%
Minibatch loss at step 171000: 0.457542
Minibatch accuracy: 90.0%
Minibatch loss at step 171500: 0.512969
Minibatch accuracy: 95.0%
Minibatch loss at step 172000: 0.619563
Minibatch accuracy: 85.0%
Validation accuracy: 93.0%
Minibatch loss at step 172500: 0.111770
Minibatch accuracy: 100.0%
Minibatch loss at step 173000: 0.312453
Minibatch accuracy: 90.0%
Minibatch loss at step 173500: 0.301364
Minibatch accuracy: 90.0%
Minibatch loss at step 174000: 0.462364
Minibatch accuracy: 80.0%
Validation accuracy: 92.2%
Minibatch loss at step 174500: 0.466927
Minibatch accuracy: 90.0%
Minibatch loss at step 175000: 0.216862
Minibatch accuracy: 95.0%
Minibatch loss at step 175500: 0.261265
Minibatch accuracy: 95.0%
Minibatch loss at step 176000: 0.704458
Minibatch accuracy: 75.0%
Validation accuracy: 92.4%
Minibatch loss at step 176500: 0.149020
Minibatch accuracy: 95.0%
Minibatch loss at step 177000: 0.357154
Minibatch accuracy: 90.0%
Minibatch loss at step 177500: 0.389771
Minibatch accuracy: 85.0%
Minibatch loss at step 178000: 0.173678
Minibatch accuracy: 100.0%
Validation accuracy: 92.7%
Minibatch loss at step 178500: 0.260392
Minibatch accuracy: 95.0%
Minibatch loss at step 179000: 0.631585
Minibatch accuracy: 85.0%
Minibatch loss at step 179500: 0.743752
Minibatch accuracy: 85.0%
Minibatch loss at step 180000: 0.137610
Minibatch accuracy: 100.0%
Validation accuracy: 92.2%
Minibatch loss at step 180500: 0.143498
Minibatch accuracy: 100.0%
Minibatch loss at step 181000: 0.203900
Minibatch accuracy: 95.0%
Minibatch loss at step 181500: 0.334231
Minibatch accuracy: 90.0%
Minibatch loss at step 182000: 0.679455
Minibatch accuracy: 85.0%
Validation accuracy: 92.5%
Minibatch loss at step 182500: 0.175268
Minibatch accuracy: 100.0%
Minibatch loss at step 183000: 0.458361
Minibatch accuracy: 90.0%
Minibatch loss at step 183500: 0.638971
Minibatch accuracy: 65.0%
Minibatch loss at step 184000: 0.147521
Minibatch accuracy: 100.0%
Validation accuracy: 92.3%
Minibatch loss at step 184500: 0.185762
Minibatch accuracy: 95.0%
Minibatch loss at step 185000: 0.356948
Minibatch accuracy: 90.0%
Minibatch loss at step 185500: 0.232420
Minibatch accuracy: 95.0%
Minibatch loss at step 186000: 0.176759
Minibatch accuracy: 100.0%
Validation accuracy: 92.3%
Minibatch loss at step 186500: 1.291495
Minibatch accuracy: 85.0%
Minibatch loss at step 187000: 0.480636
Minibatch accuracy: 85.0%
Minibatch loss at step 187500: 0.319844
Minibatch accuracy: 90.0%
Minibatch loss at step 188000: 0.135228
Minibatch accuracy: 95.0%
Validation accuracy: 92.2%
Minibatch loss at step 188500: 0.493966
Minibatch accuracy: 85.0%
Minibatch loss at step 189000: 0.204179
Minibatch accuracy: 95.0%
Minibatch loss at step 189500: 0.465761
Minibatch accuracy: 85.0%
Minibatch loss at step 190000: 0.197215
Minibatch accuracy: 100.0%
Validation accuracy: 92.9%
Minibatch loss at step 190500: 0.137369
Minibatch accuracy: 100.0%
Minibatch loss at step 191000: 0.249051
Minibatch accuracy: 95.0%
Minibatch loss at step 191500: 0.388325
Minibatch accuracy: 90.0%
Minibatch loss at step 192000: 0.122360
Minibatch accuracy: 100.0%
Validation accuracy: 93.0%
Minibatch loss at step 192500: 0.193394
Minibatch accuracy: 95.0%
Minibatch loss at step 193000: 0.381628
Minibatch accuracy: 90.0%
Minibatch loss at step 193500: 0.778883
Minibatch accuracy: 75.0%
Minibatch loss at step 194000: 0.978672
Minibatch accuracy: 75.0%
Validation accuracy: 92.1%
Minibatch loss at step 194500: 0.253928
Minibatch accuracy: 90.0%
Minibatch loss at step 195000: 0.215075
Minibatch accuracy: 95.0%
Minibatch loss at step 195500: 0.322420
Minibatch accuracy: 95.0%
Minibatch loss at step 196000: 0.767570
Minibatch accuracy: 85.0%
Validation accuracy: 92.8%
Minibatch loss at step 196500: 0.721438
Minibatch accuracy: 80.0%
Minibatch loss at step 197000: 0.149140
Minibatch accuracy: 100.0%
Minibatch loss at step 197500: 0.611390
Minibatch accuracy: 85.0%
Minibatch loss at step 198000: 0.605007
Minibatch accuracy: 85.0%
Validation accuracy: 91.8%
Minibatch loss at step 198500: 0.808166
Minibatch accuracy: 75.0%
Minibatch loss at step 199000: 0.831918
Minibatch accuracy: 80.0%
Minibatch loss at step 199500: 0.134410
Minibatch accuracy: 95.0%
Minibatch loss at step 200000: 0.137860
Minibatch accuracy: 100.0%
Validation accuracy: 92.4%
Test accuracy: 92.0%
Model saved to ./tensorflowcheckpoints/svhn

Question 4

Describe how you set up the training and testing data for your model. How does the model perform on a realistic dataset?

Answer:

The testing and training data are the cropped SVHN digits. because of the variety in the shapes and colours of the digits, the accuracy of the trained network is only about 92%, lower than the 99%+ achievable on MNIST.

Question 5

What changes did you have to make, if any, to achieve "good" results? Were there any options you explored that made the results worse?

Answer:

I tried several different optimizers, learning rates, and regularization configurations before settling on the current model. The momentum optimizer did not work well, nor did high learning rates or a high beta for L2 regularization.

This model is quite different from the one used for MNIST sequences. First of all, this model works on the cropped digits only. It is also much deeper, using 4 convolutional layers with max pooling and 90% dropout, and a fully connected layer with 50% dropout.

Question 6

What were your initial and final results with testing on a realistic dataset? Do you believe your model is doing a good enough job at classifying numbers correctly?

Answer:

Many models I tested would not score above 20.4% for the validation set. This final model, too, did not improve for the first 60,000 steps, before it suddenly started learning.

The benchmark for the Google paper was 98%. This is selected as the threshold comparable to human accuracy. My model achieves only 92%, so it would not be practical in a real-world setting.


Step 3: Test a Model on Newly-Captured Images

Take several pictures of numbers that you find around you (at least five), and run them through your classifier on your computer to produce example results. Alternatively (optionally), you can try using OpenCV / SimpleCV / Pygame to capture live images from a webcam and run those through your classifier.

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [3]:
from IPython.display import Image as iImage, display

picfolder_path = "./Pics"
pic_paths = os.listdir(picfolder_path)
for i in range(5):
    # display pics
    display(iImage(os.path.join(picfolder_path, pic_paths[i]), width=100, height=100))
    
labels = np.array([7,2,5,5,9])
In [23]:
from PIL import Image

resized = []
# resize pics
for i in range(5):
    img = Image.open(os.path.join(picfolder_path, pic_paths[i]))
    resized.extend([img.resize((32, 32),Image.ANTIALIAS)])
    display(resized[i])
    resized[i] = np.array(resized[i])

resized = np.array(resized)
In [28]:
image_size = 32
num_channels = 3
depth1 = 16
depth2 = 32
depth3 = 64
depth4 = 128
kp_fc = 0.5
kp_conv = 0.9
depth5 = 256
num_classes = 10
beta=5e-5

batch_size = 20

patch_size = 4
num_labels = 10

graph = tf.Graph()

with graph.as_default():
    # Input
    tf_test_dataset = tf.placeholder(tf.float32, shape=[5,32,32,3])
    
    # Variables
    weight_layer1 = tf.get_variable("ConvW1", shape=[patch_size, patch_size, 
        num_channels, depth1], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer1 = tf.Variable(tf.constant(1.0, shape=[depth1]))
    
    weight_layer2 = tf.get_variable("ConvW2", shape=[patch_size, patch_size, 
        depth1, depth2], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer2 = tf.Variable(tf.constant(1.0, shape=[depth2]))
    
    weight_layer3 = tf.get_variable("ConvW3", shape=[patch_size, patch_size, 
        depth2, depth3], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer3 = tf.Variable(tf.constant(1.0, shape=[depth3]))
    
    weight_layer4 = tf.get_variable("ConvW4", shape=[patch_size, patch_size, 
        depth3, depth4], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer4 = tf.Variable(tf.constant(1.0, shape=[depth4]))
    
    weight_layer5 = tf.get_variable("FcW1", shape=[2 * 2 * depth4, depth5], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer5 = tf.Variable(tf.constant(1.0, shape=[depth5]))
    
    weight_layer6 = tf.get_variable("FcW2", shape=[depth5, num_classes], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer6 = tf.Variable(tf.constant(1.0, shape=[num_classes]))
    
    # Model
    def model(data, dropout=False):
        # Convolution 1
        conv_1 = tf.nn.conv2d(data, weight_layer1, [1,1,1,1], padding='SAME')
        hidden_1 = tf.nn.relu(conv_1 + bias_layer1)
        pool_1 = tf.nn.max_pool(hidden_1, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        if dropout:
            pool_1 = tf.nn.dropout(pool_1, keep_prob=kp_conv)
        
        # Convolution 2
        conv_2 = tf.nn.conv2d(pool_1, weight_layer2, [1,1,1,1], padding='SAME')
        hidden_2 = tf.nn.relu(conv_2 + bias_layer2)
        pool_2 = tf.nn.max_pool(hidden_2, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        if dropout:
            pool_2 = tf.nn.dropout(pool_2, keep_prob=kp_conv)
        
        # Convolution 3
        conv_3 = tf.nn.conv2d(pool_2, weight_layer3, [1,1,1,1], padding='SAME')
        hidden_3 = tf.nn.relu(conv_3 + bias_layer3)
        pool_3 = tf.nn.max_pool(hidden_3, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        if dropout:
            pool_3 = tf.nn.dropout(pool_3, keep_prob=kp_conv)
            
        # Convolution 4
        conv_4 = tf.nn.conv2d(pool_3, weight_layer4, [1,1,1,1], padding='SAME')
        hidden_4 = tf.nn.relu(conv_4 + bias_layer4)
        pool_4 = tf.nn.max_pool(hidden_4, [1,2,2,1], [1,2,2,1], padding='SAME')
        
        # Reshape
        shape = pool_4.get_shape().as_list()
        reshape = tf.reshape(pool_4, [-1, shape[2] * shape[3] * shape[1]])
        
        if dropout:
            reshape = tf.nn.dropout(reshape, keep_prob=kp_fc)
        
        # Fully Connected
        fc = tf.nn.relu(tf.matmul(reshape, weight_layer5) + bias_layer5)
        
        # Output Classes
        return tf.matmul(fc, weight_layer6) + bias_layer6
        
    # Predictions
    logits = model(tf_test_dataset)
    test_prediction = tf.nn.softmax(logits)
    
    saver = tf.train.Saver()


with tf.Session(graph=graph) as session:
    saver.restore(session, "./tensorflowcheckpoints/svhn")
    test_prediction = session.run(test_prediction, feed_dict={tf_test_dataset : resized})
    
    print(np.argmax(test_prediction, axis=1))
[7 2 5 5 9]

Question 7

Choose five candidate images of numbers you took from around you and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult?

Answer:

Some of the digits are non-standard shapes of various thickness. The image of the 2 is in blue, and the 7 has large curves. However, these challenges did not affect the classifier, and it scored perfectly on all of the digits.

Question 8

Is your model able to perform equally well on captured pictures or a live camera stream when compared to testing on the realistic dataset?

Answer:

Yes it did. I arranged the pictures to be centered on the digits, just like the SVHN dataset, hence it did not prove challenging for the classifier.

Optional: Question 9

If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images.

Answer: Leave blank if you did not complete this part.


Step 4: Explore an Improvement for a Model

There are many things you can do once you have the basic classifier in place. One example would be to also localize where the numbers are on the image. The SVHN dataset provides bounding boxes that you can tune to train a localizer. Train a regression loss to the coordinates of the bounding box, and then test it.

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [25]:
import os
import sys
from six.moves.urllib.request import urlretrieve
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import idx2numpy
import time
from scipy.io import loadmat
In [26]:
import pandas as pd

traincsv = './uncropped_data/train/bbtrain.csv'
testcsv = './uncropped_data/test/bbtest.csv'

df_single_train = pd.read_csv(traincsv)
df_single_test = pd.read_csv(testcsv)
In [27]:
df_single_train.iloc[73254:]
Out[27]:
FileName DigitLabel Left Top Width Height
73254 33402.png 1 35 10 7 25
73255 33402.png 6 44 8 15 25
73256 33402.png 9 62 9 17 25
In [28]:
example_image = './uncropped_data/train/33402.png'

from PIL import Image
img = Image.open(example_image)
imgplot = plt.imshow(img)
plt.show()
In [29]:
for i in range(3):
    left = df_single_train.iloc[73254 + i]['Left']
    top = df_single_train.iloc[73254 + i]['Top']
    width = df_single_train.iloc[73254 + i]['Width']
    height = df_single_train.iloc[73254 + i]['Height']
    plt.imshow(Image.open(example_image))
    plt.plot([left, left], [top, top + height], 'c')
    plt.plot([left, left+width], [top, top], 'c')
    plt.plot([left+width, left+width], [top, top+height], 'c')
    plt.plot([left, left+width], [top+height, top+height], 'c')

plt.show()
In [30]:
lefts_train = df_single_train.groupby('FileName')['Left'].min()
tops_train = df_single_train.groupby('FileName')['Top'].min()
heights_train = df_single_train.groupby('FileName')['Height'].max()

first_left_train = df_single_train.groupby('FileName').min()['Left']
last_left_train = df_single_train.groupby('FileName').max()['Left']
last_width_train = df_single_train.groupby('FileName').last()['Width']
widths_train = last_left_train + last_width_train - first_left_train 

lefts_test = df_single_test.groupby('FileName')['Left'].min()
tops_test = df_single_test.groupby('FileName')['Top'].min()
heights_test = df_single_test.groupby('FileName')['Height'].max()

first_left_test = df_single_test.groupby('FileName').min()['Left']
last_left_test = df_single_test.groupby('FileName').max()['Left']
last_width_test = df_single_test.groupby('FileName').last()['Width']
widths_test = last_left_test + last_width_test - first_left_test 

Next Steps

Now that we have a series for each of the left, top, width, and height of the bounding box for each image, we want to arrange them into a DataFrame to prepare for cropping

In [31]:
df_crop_train = lefts_train.to_frame().join(tops_train.to_frame()).join(widths_train.to_frame()).join(heights_train.to_frame())
df_crop_test = lefts_test.to_frame().join(tops_test.to_frame()).join(widths_test.to_frame()).join(heights_test.to_frame())
In [32]:
df_crop_train.columns = ['Left','Top','Width','Height']
df_crop_test.columns = ['Left','Top','Width','Height']
df_crop_train.head()
Out[32]:
Left Top Width Height
FileName
1.png 246 77 173 219
10.png 25 4 23 27
100.png 18 0 24 22
1000.png 17 1 10 18
10000.png 45 20 43 29
In [33]:
head_images = [1,10,100,1000,10000]

for i in head_images:
    image_name = '{}.png'.format(i)
    example_image = './uncropped_data/train/' + image_name
    img = Image.open(example_image)
    imgplot = plt.imshow(img)
    
    left = df_crop_train.loc[image_name]['Left']    
    top = df_crop_train.loc[image_name]['Top']    
    width = df_crop_train.loc[image_name]['Width']    
    height = df_crop_train.loc[image_name]['Height']
    
    plt.plot([left, left], [top, top + height], 'c')
    plt.plot([left, left+width], [top, top], 'c')
    plt.plot([left+width, left+width], [top, top+height], 'c')
    plt.plot([left, left+width], [top+height, top+height], 'c')
    
    plt.show()

Before cropping, it is suggested that we expand the bounding boxes by 30% in each direction. This will involve subtracting 30% of the width from the left column, subtracting 30% of the height from the top, etc. We must also ensure that this expansion does not go past the edge of the image, so we need to get the true height and width for each image.

In [34]:
df_crop_train['ImageX'] = np.nan
df_crop_train['ImageY'] = np.nan

df_crop_test['ImageX'] = np.nan
df_crop_test['ImageY'] = np.nan
In [35]:
def set_image_dims(df_crop, test=False):
    max_num = df_crop.shape[0]
    for i in range(1, max_num+1):
        folder = str()
        if test:
            folder = './uncropped_data/test/'
        else:
            folder = './uncropped_data/train/'
        
        image_name = '{}.png'.format(i)
        example_image = folder + image_name
        
        img = Image.open(example_image)
        width, height = img.size
        df_crop.set_value(image_name, ['ImageX'], width)
        df_crop = df_crop.set_value(image_name, ['ImageY'], height)
In [36]:
set_image_dims(df_crop_train, test=False)
set_image_dims(df_crop_test, test=True)
In [37]:
df_crop_train.head()
df_crop_test.head()

print(df_crop_train.isnull().values.any())
print(df_crop_test.isnull().values.any())
False
False

Now that we have the image dimensions, we can apply a function to the columns of the DataFrame that expands the bounding boxes.

In [38]:
def fixLeft(row):
    return max( row['Left'] - 0.3 * row['Width'] , 0 )

df_crop_train['cropLeft'] = df_crop_train.apply(fixLeft, axis=1)
df_crop_test['cropLeft'] = df_crop_test.apply(fixLeft, axis=1)

def fixTop(row):
    return max( row['Top'] - 0.3 * row['Height'] , 0 )

df_crop_train['cropTop'] = df_crop_train.apply(fixTop, axis=1)
df_crop_test['cropTop'] = df_crop_test.apply(fixTop, axis=1)
In [39]:
def fixWidth(row):
    return( min( 1.6 * row['Width'], row['ImageX'] - row['cropLeft']))

df_crop_train['cropWidth'] = df_crop_train.apply(fixWidth, axis=1)
df_crop_test['cropWidth'] = df_crop_test.apply(fixWidth, axis=1)

def fixHeight(row):
    if row['Top'] + 1.6 * row['Height'] > row['ImageY']:
        return row['ImageY'] - row['cropTop']
    else:
        return 1.6 * row['Height']

df_crop_train['cropHeight'] = df_crop_train.apply(fixHeight, axis=1)
df_crop_test['cropHeight'] = df_crop_test.apply(fixHeight, axis=1)
In [40]:
for i in head_images:
    image_name = '{}.png'.format(i)
    example_image = './uncropped_data/train/' + image_name
    img = Image.open(example_image)
    imgplot = plt.imshow(img)
    
    left = df_crop_train.loc[image_name]['cropLeft']    
    top = df_crop_train.loc[image_name]['cropTop']    
    width = df_crop_train.loc[image_name]['cropWidth']    
    height = df_crop_train.loc[image_name]['cropHeight']
    
    
    plt.plot([left, left], [top, top + height], 'c')
    plt.plot([left, left+width], [top, top], 'c')
    plt.plot([left+width, left+width], [top, top+height], 'c')
    plt.plot([left, left+width], [top+height, top+height], 'c')
    
    plt.show()
In [41]:
df_crop_train.head()
Out[41]:
Left Top Width Height ImageX ImageY cropLeft cropTop cropWidth cropHeight
FileName
1.png 246 77 173 219 741.0 350.0 194.1 11.3 276.8 338.7
10.png 25 4 23 27 74.0 37.0 18.1 0.0 36.8 37.0
100.png 18 0 24 22 67.0 27.0 10.8 0.0 38.4 27.0
1000.png 17 1 10 18 44.0 21.0 14.0 0.0 16.0 21.0
10000.png 45 20 43 29 137.0 62.0 32.1 11.3 68.8 50.7
In [42]:
df_crop_train = df_crop_train.drop(['Left','Top','Width','Height'], axis=1)
df_crop_test = df_crop_test.drop(['Left','Top','Width','Height'], axis=1)

Now it is time to crop and resize all of the images, while preserving the location of the bounding boxes. We have four DataFrames:

df_single_train
df_single_test
df_crop_train
df_crop_test

Once we crop the images, the original two DataFrames' information becomes incorrect. So from the Top and Left columns, we need to subtract the values of the cropTop and cropLeft. But first let's crop:

In [21]:
def crop_images(df_crop, test=False):
    max_num = df_crop.shape[0]
    for i in range(1, max_num + 1):
        image_name = '{}.png'.format(i)
        folder = './uncropped_data/'
        if test:
            folder = folder + 'test/'
        else:
            folder = folder + 'train/'
        example_image = folder + image_name
        img = Image.open(example_image)
        
        # locate the top, left, width, and height of the crop
        left = df_crop.loc[image_name]['cropLeft']    
        top = df_crop.loc[image_name]['cropTop']    
        width = df_crop.loc[image_name]['cropWidth']    
        height = df_crop.loc[image_name]['cropHeight']
        
        # create a new image during a crop, and save to the folder
        newfolder = './uncropped_data/'
        if test:
            newfolder = newfolder + 'cropTest/'
        else:
            newfolder = newfolder + 'cropTrain/'
        
        new_image_loc = newfolder + image_name
        
        img2 = img.crop(( left, top, width+left, height+top ))
        img2 = img2.resize((64,64))
        img2.save(new_image_loc)
        

#crop_images(df_crop_train, test=False)
#crop_images(df_crop_test, test=True)
In [43]:
def display_image(image_name, crop=False, test=False):
    folder = './uncropped_data/'
    
    if crop and test:
        folder = folder + 'cropTest/'
    elif crop and not test:
        folder = folder + 'cropTrain/'
    elif not crop and test:
        folder = folder + 'test/'
    else:
        folder = folder + 'train/'
    
    example_image = folder + image_name
    img = Image.open(example_image)
    plt.imshow(img)
    plt.show()
In [44]:
for image in head_images:
    image = str(image)
    display_image(image+'.png', crop=True)

The cropping is finished! Now we need to get the values of the bounding boxes. This will be easiest if we just add the values for the transformed boxes to the original DataFrames

In [45]:
test_df_single = df_single_train.iloc[:20]
test_df_crop = df_crop_train.loc[['1.png','2.png','3.png','4.png','5.png','6.png','7.png','8.png','9.png']]
print(test_df_single)
test_df_crop
   FileName  DigitLabel  Left  Top  Width  Height
0     1.png           1   246   77     81     219
1     1.png           9   323   81     96     219
2     2.png           2    77   29     23      32
3     2.png           3    98   25     26      32
4     3.png           2    17    5      8      15
5     3.png           5    25    5      9      15
6     4.png           9    57   13     15      34
7     4.png           3    72   13     13      34
8     5.png           3    52    7     21      46
9     5.png           1    74   10     15      46
10    6.png           3    28    6     10      21
11    6.png           3    38    8     11      21
12    7.png           2    35   10     13      32
13    7.png           8    47   11     13      32
14    8.png           7    17    4      7      15
15    8.png           4    25    4      6      15
16    8.png           4    31    3      7      15
17    9.png           1    19    4     14      24
18    9.png           2    29    4     13      24
19    9.png           8    38    5     17      24
Out[45]:
ImageX ImageY cropLeft cropTop cropWidth cropHeight
FileName
1.png 741.0 350.0 194.1 11.3 276.8 338.7
2.png 199.0 83.0 62.9 15.4 75.2 51.2
3.png 52.0 23.0 11.9 0.5 27.2 22.5
4.png 161.0 79.0 48.6 2.8 44.8 54.4
5.png 140.0 68.0 40.9 0.0 59.2 68.0
6.png 74.0 35.0 21.7 0.0 33.6 35.0
7.png 99.0 54.0 27.5 0.4 40.0 53.6
8.png 54.0 22.0 10.7 0.0 33.6 22.0
9.png 79.0 34.0 8.2 0.0 57.6 34.0
In [46]:
# These functions calculate the new locations for the bounding boxes
# based on a crop a translation to this image size:

image_size = 64.0

def get_new_top(oldTop, cropTop, cropHeight):
    newTop = oldTop - cropTop
    newTop *= image_size/cropHeight
    return newTop

def get_new_left(oldLeft, cropLeft, cropWidth):
    newLeft = oldLeft - cropLeft
    newLeft *= image_size/cropWidth
    return newLeft

def get_new_width(oldWidth, cropWidth):
    newWidth = image_size*oldWidth/cropWidth
    return newWidth

def get_new_height(oldHeight, cropHeight):
    newHeight = image_size*oldHeight/cropHeight
    return newHeight
In [47]:
def cropped_bbs(df_single, df_crop):
    max_num = df_crop.shape[0]
    
    # add 20 new columns to our crop DataFrame
    df_crop['d1_bb_top'] = np.nan
    df_crop['d1_bb_left'] = np.nan
    df_crop['d1_bb_width'] = np.nan
    df_crop['d1_bb_height'] = np.nan
    
    df_crop['d2_bb_top'] = np.nan
    df_crop['d2_bb_left'] = np.nan
    df_crop['d2_bb_width'] = np.nan
    df_crop['d2_bb_height'] = np.nan
    
    df_crop['d3_bb_top'] = np.nan
    df_crop['d3_bb_left'] = np.nan
    df_crop['d3_bb_width'] = np.nan
    df_crop['d3_bb_height'] = np.nan
    
    df_crop['d4_bb_top'] = np.nan
    df_crop['d4_bb_left'] = np.nan
    df_crop['d4_bb_width'] = np.nan
    df_crop['d4_bb_height'] = np.nan
    
    df_crop['d5_bb_top'] = np.nan
    df_crop['d5_bb_left'] = np.nan
    df_crop['d5_bb_width'] = np.nan
    df_crop['d5_bb_height'] = np.nan
    
    for i in range(1, max_num+1):
        image = str(i) + '.png'
        
        cropTop = df_crop.loc[image]['cropTop']
        cropLeft = df_crop.loc[image]['cropLeft']
        cropHeight = df_crop.loc[image]['cropHeight']
        cropWidth = df_crop.loc[image]['cropWidth']
        
        df_slice = df_single[df_single['FileName'] == image]
        num_digits = df_slice.shape[0]
        
        for digit_place in range(1,6):
            d_holder = 'd' + str(digit_place) + '_bb_'
            
            if digit_place <= num_digits:
                
                oldTop = df_slice.iloc[digit_place-1]['Top']
                oldLeft = df_slice.iloc[digit_place-1]['Left']
                oldWidth = df_slice.iloc[digit_place-1]['Width']
                oldHeight = df_slice.iloc[digit_place-1]['Height']
                
                newTop = get_new_top(oldTop, cropTop, cropHeight)
                newLeft = get_new_left(oldLeft, cropLeft, cropWidth)
                newWidth = get_new_width(oldWidth, cropWidth)
                newHeight = get_new_height(oldHeight, cropHeight)
                
                df_crop.set_value(image, [d_holder+'top'], newTop)
                df_crop.set_value(image, [d_holder+'left'], newLeft)
                df_crop.set_value(image, [d_holder+'width'], newWidth)
                df_crop.set_value(image, [d_holder+'height'], newHeight)
            
            else:
                # For blank digits, we will put the bounding box 
                # outside the range of the image
                df_crop.set_value(image, [d_holder+'top'], 0)
                df_crop.set_value(image, [d_holder+'left'], 65)
                df_crop.set_value(image, [d_holder+'width'], 10)
                df_crop.set_value(image, [d_holder+'height'], 10)
In [48]:
cropped_bbs(df_single_train, df_crop_train)
cropped_bbs(df_single_test, df_crop_test)
In [49]:
df_crop_train.head()
Out[49]:
ImageX ImageY cropLeft cropTop cropWidth cropHeight d1_bb_top d1_bb_left d1_bb_width d1_bb_height ... d3_bb_width d3_bb_height d4_bb_top d4_bb_left d4_bb_width d4_bb_height d5_bb_top d5_bb_left d5_bb_width d5_bb_height
FileName
1.png 741.0 350.0 194.1 11.3 276.8 338.7 12.414526 12.0 18.728324 41.381754 ... 10.000000 10.000000 0.0 65.0 10.0 10.0 0.0 65.0 10.0 10.0
10.png 74.0 37.0 18.1 0.0 36.8 37.0 10.378378 12.0 15.652174 46.702703 ... 10.000000 10.000000 0.0 65.0 10.0 10.0 0.0 65.0 10.0 10.0
100.png 67.0 27.0 10.8 0.0 38.4 27.0 2.370370 12.0 8.333333 52.148148 ... 16.666667 52.148148 0.0 65.0 10.0 10.0 0.0 65.0 10.0 10.0
1000.png 44.0 21.0 14.0 0.0 16.0 21.0 3.047619 12.0 16.000000 54.857143 ... 10.000000 10.000000 0.0 65.0 10.0 10.0 0.0 65.0 10.0 10.0
10000.png 137.0 62.0 32.1 11.3 68.8 50.7 10.982249 12.0 13.953488 36.607495 ... 16.744186 36.607495 0.0 65.0 10.0 10.0 0.0 65.0 10.0 10.0

5 rows × 26 columns

In [50]:
def plot_crop_bbs(image_name, test=False):
    folder = './uncropped_data/'
    if test:
        folder += 'cropTest/'
    else:
        folder += 'cropTrain/'
    example_image = folder + image_name
    img = Image.open(example_image)
    plt.imshow(img)
    
    for digit_place in range(1,6):
        d_holder = 'd' + str(digit_place) + '_bb_'
        
        left = df_crop_train.loc[image_name][d_holder+'left']    
        top = df_crop_train.loc[image_name][d_holder+'top']    
        width = df_crop_train.loc[image_name][d_holder+'width']    
        height = df_crop_train.loc[image_name][d_holder+'height']
        
        plt.plot([left, left], [top, top + height], 'c')
        plt.plot([left, left+width], [top, top], 'c')
        plt.plot([left+width, left+width], [top, top+height], 'c')
        plt.plot([left, left+width], [top+height, top+height], 'c')
    
    plt.show()
In [51]:
plot_crop_bbs('20000.png')
display_image('20000.png')
display_image('20000.png', crop=True)

We need to be able to plot the bounding boxes on the cropped images. Let's modify df_crop_bb_train to hold the bounding boxes for each image.

In [52]:
newcols = ['d1_bb_top', 'd1_bb_left', 'd1_bb_width', 'd1_bb_height',
          'd2_bb_top', 'd2_bb_left', 'd2_bb_width', 'd2_bb_height',
          'd3_bb_top', 'd3_bb_left', 'd3_bb_width', 'd3_bb_height',
          'd4_bb_top', 'd4_bb_left', 'd4_bb_width', 'd4_bb_height',
          'd5_bb_top', 'd5_bb_left', 'd5_bb_width', 'd5_bb_height']

for col in newcols:
    if df_crop_train[col].isnull().values.any():
        print('train',col)
    if df_crop_test[col].isnull().values.any():
        print('test',col)
In [53]:
display_image('10036.png', test=True, crop=True)
In [54]:
plot_crop_bbs('10036.png',test=True)
In [55]:
def get_bb_array(image_name, test=False):
    folder = './uncropped_data/'
    if test:
        folder += 'cropTest/'
    else:
        folder += 'cropTrain/'
    example_image = folder + image_name
    img = Image.open(example_image)
    
    bbs = []
    
    for digit_place in range(1,6):
        d_holder = 'd' + str(digit_place) + '_bb_'
        
        left = df_crop_train.loc[image_name][d_holder+'left']   
        top = df_crop_train.loc[image_name][d_holder+'top']    
        width = df_crop_train.loc[image_name][d_holder+'width']    
        height = df_crop_train.loc[image_name][d_holder+'height']
        
        bbs.append(left)
        bbs.append(top)
        bbs.append(width)
        bbs.append(height)
    
    img.close()
    return np.array(bbs).reshape((-1,4))
In [56]:
get_bb_array('10036.png',test=True)
Out[56]:
array([[ 12.        ,   7.68      ,  18.18181818,  40.96      ],
       [ 33.81818182,   7.68      ,  18.18181818,  40.96      ],
       [ 65.        ,   0.        ,  10.        ,  10.        ],
       [ 65.        ,   0.        ,  10.        ,  10.        ],
       [ 65.        ,   0.        ,  10.        ,  10.        ]])

If seems the pipeline for processing images is complete. Next we need a metric. One way to evaluate the bounding boxes is by using the Intersect of Union. It is calculated as the area of the overlapping regions of the predicted and actual bounding boxes divided by the combined area.

In [57]:
def batch_iou(a, b, epsilon=1e-5):
    """ 
    a and b are batches of bounding boxes, [left, top, width, height]
    
    I slightly modified the code from this tutorial:
    http://ronny.rest/tutorials/lesson/intersect_of_union/
    """
    
    a_rights = a[:, 0] + a[:, 2]
    a_bottoms = a[:, 1] + a[:, 3]
    
    b_rights = b[:, 0] + b[:, 2]
    b_bottoms = b[:, 1] + b[:, 3]
    
    # COORDINATES OF THE INTERSECTION BOXES
    x1 = np.array([a[:, 0], b[:, 0]]).max(axis=0)
    y1 = np.array([a[:, 1], b[:, 1]]).max(axis=0)
    x2 = np.array([a_rights, b_rights]).min(axis=0)
    y2 = np.array([a_bottoms, b_bottoms]).min(axis=0)

    # AREAS OF OVERLAP - Area where the boxes intersect
    width = (x2 - x1)
    height = (y2 - y1)

    # handle case where there is NO overlap
    width[width < 0] = 0
    height[height < 0] = 0

    area_overlap = width * height

    # COMBINED AREAS
    area_a = (a_rights - a[:, 0]) * (a_bottoms - a[:, 1])
    area_b = (b_rights - b[:, 0]) * (b_bottoms - b[:, 1])
    area_combined = area_a + area_b - area_overlap

    # RATIO OF AREA OF OVERLAP OVER COMBINED AREA
    iou = area_overlap / (area_combined + epsilon)
    return iou
In [58]:
batch_a = get_bb_array('10036.png', test=True)
pred_batch_a = np.array([1+i for i in batch_a])
In [59]:
batch_iou(batch_a, pred_batch_a)
Out[59]:
array([ 0.79556984,  0.79556984,  0.57857139,  0.57857139,  0.57857139])
In [60]:
def metric(true_boxes, pred_boxes):
    ious = batch_iou(np.reshape(true_boxes, (-1, 4)), 
                     np.reshape(pred_boxes, (-1, 4)))
    return np.mean(ious)
In [89]:
image_size = 64
num_channels = 3

def make_datasets(valid_size = 5000):
    folder = './uncropped_data/'
    train_folder = folder + 'cropTrain/'
    test_folder = folder + 'cropTest/'
    
    train_size = len(os.listdir(train_folder)) - valid_size
    test_size = len(os.listdir(test_folder))
    
    train_dataset = np.ndarray(shape=(train_size, image_size, image_size, num_channels), dtype=np.float32)
    valid_dataset = np.ndarray(shape=(valid_size, image_size, image_size, num_channels), dtype=np.float32)
    test_dataset = np.ndarray(shape=(test_size, image_size, image_size, num_channels), dtype=np.float32)
        
    for i in range(1,train_size+1):
        image = str(i) + '.png'
        img = Image.open( train_folder + image )
        data = np.asarray( img, dtype=np.float32 )
        img.close()
        train_dataset[i-1,:,:,:] = (data - 128.0) / 128.0
        
    for i in range(1,valid_size+1):
        image = str(i+train_size) + '.png'
        img = Image.open( train_folder + image)
        data = np.asarray( img, dtype=np.float32)
        img.close()
        valid_dataset[i-1,:,:,:] = (data - 128.0) / 128.0
    
    for i in range(1,test_size+1):
        image = str(i) + '.png'
        img = Image.open( test_folder + image)
        data = np.asarray( img, dtype=np.float32)
        img.close()
        test_dataset[i-1,:,:,:] = (data - 128.0) / 128.0
        
    
    return train_dataset, valid_dataset, test_dataset

train_dataset, valid_dataset, test_dataset = make_datasets()
In [90]:
print(train_dataset.shape)
print(valid_dataset.shape)
print(test_dataset.shape)
(28402, 64, 64, 3)
(5000, 64, 64, 3)
(13068, 64, 64, 3)
In [91]:
def make_labels(valid_size=5000):
    folder = './uncropped_data/'
    train_folder = folder + 'cropTrain/'
    test_folder = folder + 'cropTest/'
    
    train_size = len(os.listdir(train_folder)) - valid_size
    test_size = len(os.listdir(test_folder))
    
    train_labels = np.ndarray(shape=(train_size, 20))
    valid_labels = np.ndarray(shape=(valid_size, 20))
    test_labels = np.ndarray(shape=(test_size, 20))
    
    for i in range(1,train_size+1):
        image = str(i) + '.png'
        train_labels[i-1,:] = np.reshape(get_bb_array(image, 
                                    test=False),newshape=(20))/64.0
        
    for i in range(1, valid_size+1):
        image = str(i+train_size) + '.png'
        valid_labels[i-1,:] = np.reshape(get_bb_array(image,
                                    test=False),newshape=(20))/64.0
    
    for i in range(1,test_size+1):
        image= str(i) + '.png'
        test_labels[i-1,:] = np.reshape(get_bb_array(image,
                                    test=True),newshape=(20))/64.0
        
    return train_labels, valid_labels, test_labels

train_labels, valid_labels, test_labels = make_labels()
In [92]:
print(train_labels.shape)
print(valid_labels.shape)
print(test_labels.shape)
(28402, 20)
(5000, 20)
(13068, 20)
In [93]:
def pickle_it():
    pickle.dump(train_dataset[:20000], open("train_dataset1.pkl", "wb"))
    pickle.dump(train_dataset[20000:], open("train_dataset2.pkl", "wb"))
    pickle.dump(valid_dataset, open("valid_dataset.pkl", "wb"))
    pickle.dump(test_dataset, open("test_dataset.pkl", "wb"))

    pickle.dump(train_labels, open("train_labels.pkl", "wb"))
    pickle.dump(valid_labels, open("valid_labels.pkl", "wb"))
    pickle.dump(test_labels, open("test_labels.pkl", "wb"))

pickle_it()
In [94]:
import pickle

def unpickle_it():
    train_dataset1 = pickle.load(open("train_dataset1.pkl", "rb"))
    #train_dataset2 = pickle.load(open("train_dataset2.pkl", "rb"))
    #train_dataset = np.concatenate([train_dataset1, train_dataset2], axis=0)
    valid_dataset = pickle.load(open("valid_dataset.pkl", "rb"))
    test_dataset = pickle.load(open("test_dataset.pkl", "rb"))
    
    train_labels = pickle.load(open("train_labels.pkl", "rb"))
    valid_labels = pickle.load(open("valid_labels.pkl", "rb"))
    test_labels = pickle.load(open("test_labels.pkl", "rb"))
    
    return train_dataset1, valid_dataset, test_dataset, train_labels, valid_labels, test_labels

train_dataset, valid_dataset, test_dataset, train_labels, valid_labels, test_labels = unpickle_it()
In [95]:
train_dataset = train_dataset.astype(np.float32)
valid_dataset = valid_dataset.astype(np.float32)
test_dataset = test_dataset.astype(np.float32)

train_labels = train_labels.astype(np.float32)
valid_labels = valid_labels.astype(np.float32)
test_labels = test_labels.astype(np.float32)

print(train_dataset.shape, train_dataset.dtype)
print(valid_dataset.shape, valid_dataset.dtype)
print(test_dataset.shape, test_dataset.dtype)

print(train_labels.shape, train_labels.dtype)
print(valid_labels.shape, valid_labels.dtype)
print(test_labels.shape, test_labels.dtype)
(20000, 64, 64, 3) float32
(5000, 64, 64, 3) float32
(13068, 64, 64, 3) float32
(28402, 20) float32
(5000, 20) float32
(13068, 20) float32
In [105]:
# trim test set and labels because of memory error
test_dataset = test_dataset[:5000]
test_labels = test_labels[:5000]

train_labels = train_labels[:20000]
In [115]:
depth1 = 16
depth2 = 32
depth3 = 64
depth4 = 128

num_outputs = 20
beta=5e-5
patch_size = 4
image_size = 64
num_channels = 3

graph = tf.Graph()

with graph.as_default():
    # Input
    tf_train_dataset = tf.placeholder(tf.float32, 
                shape=(None, image_size, image_size, num_channels))
    tf_train_labels = tf.placeholder(tf.float32,
                shape=(None, 20))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variables
    weight_layer1 = tf.get_variable("ConvW1", shape=[patch_size, patch_size, 
        num_channels, depth1], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer1 = tf.Variable(tf.constant(1.0, shape=[depth1]))
    
    weight_layer2 = tf.get_variable("ConvW2", shape=[patch_size, patch_size, 
        depth1, depth2], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer2 = tf.Variable(tf.constant(1.0, shape=[depth2]))
    
    weight_layer3 = tf.get_variable("ConvW3", shape=[patch_size, patch_size, 
        depth2, depth3], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer3 = tf.Variable(tf.constant(1.0, shape=[depth3]))
    
    weight_layer4 = tf.get_variable("ConvW4", shape=[patch_size, patch_size, 
        depth3, depth4], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer4 = tf.Variable(tf.constant(1.0, shape=[depth4]))
    
    weight_layer5 = tf.get_variable("FcW1", shape=[4 * 4 * depth4, num_outputs], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer5 = tf.Variable(tf.constant(1.0, shape=[num_outputs]))
    
    # Model
    def model(data, ):
        conv1 = tf.nn.conv2d(data, weight_layer1, [1,2,2,1], padding='SAME')
        hidden1 = tf.nn.relu(conv1 + bias_layer1)
        
        conv2 = tf.nn.conv2d(hidden1, weight_layer2, [1,2,2,1], padding='SAME')
        hidden2 = tf.nn.relu(conv2 + bias_layer2)
        
        conv3 = tf.nn.conv2d(hidden2, weight_layer3, [1,2,2,1], padding='SAME')
        hidden3 = tf.nn.relu(conv3 + bias_layer3)
        
        conv4 = tf.nn.conv2d(hidden3, weight_layer4, [1,2,2,1], padding='SAME')
        hidden4 = tf.nn.relu(conv4 + bias_layer4)
        
        shape = hidden4.get_shape().as_list()
        reshape = tf.reshape(hidden4, [-1, shape[1]*shape[2]*shape[3]])
        
        return tf.nn.relu(tf.matmul(reshape, weight_layer5) + bias_layer5)
    
    # Training computation
    logits = model(tf_train_dataset)
    loss = tf.reduce_mean(tf.squared_difference(logits, tf_train_labels))
    loss += beta*(tf.nn.l2_loss(weight_layer1)+tf.nn.l2_loss(weight_layer2)+tf.nn.l2_loss(weight_layer3)+tf.nn.l2_loss(weight_layer4)+tf.nn.l2_loss(weight_layer5))
    
    # Optimizer
    global_step = tf.Variable(0)
    learning_rate = tf.train.exponential_decay(0.05, global_step, 
            decay_steps=10000, decay_rate=0.96, staircase=True)
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    
    # Predictions for the training, validation, and test data
    train_prediction = logits
    valid_prediction = model(tf_valid_dataset)
    test_prediction = model(tf_test_dataset)
    
    saver = tf.train.Saver()
In [116]:
num_steps = 300001
batch_size = 6

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    #saver.restore(session, tf.train.latest_checkpoint('./tensorflowcheckpoints/'))
    print('Initialized')
    
    for step in range(num_steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run(
          [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 2500 == 0):
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % metric(predictions, batch_labels))
        if (step % 10000 == 0):
            print('Validation accuracy: %.1f%%' % metric(
                valid_prediction.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % metric(test_prediction.eval(), test_labels))
    save_path = saver.save(session, "./tensorflowcheckpoints/bbs4")
    print('Model saved to {}'.format(save_path))
Initialized
Minibatch loss at step 0: 1.719239
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 2500: 0.548391
Minibatch accuracy: 0.0%
Minibatch loss at step 5000: 1.719223
Minibatch accuracy: 0.0%
Minibatch loss at step 7500: 16.522694
Minibatch accuracy: 0.0%
Minibatch loss at step 10000: 15.912806
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 12500: 15.209406
Minibatch accuracy: 0.0%
Minibatch loss at step 15000: 15.644851
Minibatch accuracy: 0.0%
Minibatch loss at step 17500: 15.188120
Minibatch accuracy: 0.0%
Minibatch loss at step 20000: 14.483620
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 22500: 13.665503
Minibatch accuracy: 0.0%
Minibatch loss at step 25000: 12.773808
Minibatch accuracy: 0.0%
Minibatch loss at step 27500: 12.502512
Minibatch accuracy: 0.0%
Minibatch loss at step 30000: 14.814592
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 32500: 17.415756
Minibatch accuracy: 0.0%
Minibatch loss at step 35000: 15.365860
Minibatch accuracy: 0.0%
Minibatch loss at step 37500: 14.048973
Minibatch accuracy: 0.0%
Minibatch loss at step 40000: 10.967856
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 42500: 15.951011
Minibatch accuracy: 0.0%
Minibatch loss at step 45000: 14.306408
Minibatch accuracy: 0.0%
Minibatch loss at step 47500: 14.087087
Minibatch accuracy: 0.0%
Minibatch loss at step 50000: 19.918148
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 52500: 19.770412
Minibatch accuracy: 0.0%
Minibatch loss at step 55000: 19.586573
Minibatch accuracy: 0.0%
Minibatch loss at step 57500: 22.398403
Minibatch accuracy: 0.0%
Minibatch loss at step 60000: 24.281715
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 62500: 24.095993
Minibatch accuracy: 0.0%
Minibatch loss at step 65000: 29.204632
Minibatch accuracy: 0.0%
Minibatch loss at step 67500: 30.239912
Minibatch accuracy: 0.0%
Minibatch loss at step 70000: 30.091572
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 72500: 30.146229
Minibatch accuracy: 0.0%
Minibatch loss at step 75000: 29.355564
Minibatch accuracy: 0.0%
Minibatch loss at step 77500: 29.652508
Minibatch accuracy: 0.0%
Minibatch loss at step 80000: 29.466768
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 82500: 28.399220
Minibatch accuracy: 0.0%
Minibatch loss at step 85000: 27.014208
Minibatch accuracy: 0.0%
Minibatch loss at step 87500: 24.943045
Minibatch accuracy: 0.0%
Minibatch loss at step 90000: 22.498581
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 92500: 20.273788
Minibatch accuracy: 0.0%
Minibatch loss at step 95000: 17.687866
Minibatch accuracy: 0.0%
Minibatch loss at step 97500: 14.084508
Minibatch accuracy: 0.0%
Minibatch loss at step 100000: 11.621584
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 102500: 9.456659
Minibatch accuracy: 0.0%
Minibatch loss at step 105000: 8.024334
Minibatch accuracy: 0.0%
Minibatch loss at step 107500: 6.914643
Minibatch accuracy: 0.0%
Minibatch loss at step 110000: 6.038265
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 112500: 5.137299
Minibatch accuracy: 0.0%
Minibatch loss at step 115000: 5.572999
Minibatch accuracy: 0.0%
Minibatch loss at step 117500: 5.941804
Minibatch accuracy: 0.0%
Minibatch loss at step 120000: 5.694638
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 122500: 6.001942
Minibatch accuracy: 0.0%
Minibatch loss at step 125000: 6.139653
Minibatch accuracy: 0.0%
Minibatch loss at step 127500: 6.402538
Minibatch accuracy: 0.0%
Minibatch loss at step 130000: 7.399650
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 132500: 8.169127
Minibatch accuracy: 0.0%
Minibatch loss at step 135000: 8.249006
Minibatch accuracy: 0.0%
Minibatch loss at step 137500: 10.715422
Minibatch accuracy: 0.0%
Minibatch loss at step 140000: 10.254310
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 142500: 11.657508
Minibatch accuracy: 0.0%
Minibatch loss at step 145000: 11.167723
Minibatch accuracy: 0.0%
Minibatch loss at step 147500: 13.536378
Minibatch accuracy: 0.0%
Minibatch loss at step 150000: 13.018026
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 152500: 12.464867
Minibatch accuracy: 0.0%
Minibatch loss at step 155000: 12.373894
Minibatch accuracy: 0.0%
Minibatch loss at step 157500: 11.308949
Minibatch accuracy: 0.0%
Minibatch loss at step 160000: 10.092434
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 162500: 9.758890
Minibatch accuracy: 0.0%
Minibatch loss at step 165000: 9.030772
Minibatch accuracy: 0.0%
Minibatch loss at step 167500: 8.571747
Minibatch accuracy: 0.0%
Minibatch loss at step 170000: 8.298415
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 172500: 8.066391
Minibatch accuracy: 0.0%
Minibatch loss at step 175000: 7.698763
Minibatch accuracy: 0.0%
Minibatch loss at step 177500: 8.274456
Minibatch accuracy: 0.0%
Minibatch loss at step 180000: 8.286598
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 182500: 8.812264
Minibatch accuracy: 0.0%
Minibatch loss at step 185000: 8.619462
Minibatch accuracy: 0.0%
Minibatch loss at step 187500: 8.696571
Minibatch accuracy: 0.0%
Minibatch loss at step 190000: 8.465579
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 192500: 9.150507
Minibatch accuracy: 0.0%
Minibatch loss at step 195000: 24740909056.000000
Minibatch accuracy: 0.0%
Minibatch loss at step 197500: 10.063444
Minibatch accuracy: 0.0%
Minibatch loss at step 200000: 10.006704
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 202500: 9.995254
Minibatch accuracy: 0.0%
Minibatch loss at step 205000: 10.141016
Minibatch accuracy: 0.0%
Minibatch loss at step 207500: 9.933149
Minibatch accuracy: 0.0%
Minibatch loss at step 210000: 9.686116
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 212500: 9.396648
Minibatch accuracy: 0.0%
Minibatch loss at step 215000: 9.085619
Minibatch accuracy: 0.0%
Minibatch loss at step 217500: 9.142465
Minibatch accuracy: 0.0%
Minibatch loss at step 220000: 8.783246
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 222500: 8.332757
Minibatch accuracy: 0.0%
Minibatch loss at step 225000: 7.694438
Minibatch accuracy: 0.0%
Minibatch loss at step 227500: 6.148409
Minibatch accuracy: 0.0%
Minibatch loss at step 230000: 3.488290
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 232500: 1.158673
Minibatch accuracy: 0.0%
Minibatch loss at step 235000: 0.427613
Minibatch accuracy: 0.0%
Minibatch loss at step 237500: 0.306078
Minibatch accuracy: 0.0%
Minibatch loss at step 240000: 0.289585
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 242500: 0.288693
Minibatch accuracy: 0.0%
Minibatch loss at step 245000: 0.268311
Minibatch accuracy: 0.0%
Minibatch loss at step 247500: 0.261761
Minibatch accuracy: 0.0%
Minibatch loss at step 250000: 0.248120
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 252500: 0.234750
Minibatch accuracy: 0.0%
Minibatch loss at step 255000: 0.234668
Minibatch accuracy: 0.0%
Minibatch loss at step 257500: 0.234770
Minibatch accuracy: 0.0%
Minibatch loss at step 260000: 0.236546
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 262500: 0.252712
Minibatch accuracy: 0.0%
Minibatch loss at step 265000: 0.241282
Minibatch accuracy: 0.0%
Minibatch loss at step 267500: 0.236775
Minibatch accuracy: 0.0%
Minibatch loss at step 270000: 0.255327
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 272500: 0.228942
Minibatch accuracy: 0.0%
Minibatch loss at step 275000: 0.232794
Minibatch accuracy: 0.0%
Minibatch loss at step 277500: 0.251111
Minibatch accuracy: 0.0%
Minibatch loss at step 280000: 0.215594
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 282500: 0.233860
Minibatch accuracy: 0.0%
Minibatch loss at step 285000: 0.233055
Minibatch accuracy: 0.0%
Minibatch loss at step 287500: 0.248072
Minibatch accuracy: 0.0%
Minibatch loss at step 290000: 0.250713
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Minibatch loss at step 292500: 0.241651
Minibatch accuracy: 0.0%
Minibatch loss at step 295000: 0.247429
Minibatch accuracy: 0.0%
Minibatch loss at step 297500: 0.231702
Minibatch accuracy: 0.0%
Minibatch loss at step 300000: 0.233741
Minibatch accuracy: 0.0%
Validation accuracy: 0.0%
Test accuracy: 0.0%
Model saved to ./tensorflowcheckpoints/bbs4

Question 10

How well does your model localize numbers on the testing set from the realistic dataset? Do your classification results change at all with localization included?

Answer:

The classifier was trained on the single digits, and the bounding box regressor was trained on the multi-digit set, so I can't test the classification abilities of the model with the bounding boxes. However, since the classifier and regressor work separately, I would assume the bounding boxes do not affect the accuracy.

Question 11

Test the localization function on the images you captured in Step 3. Does the model accurately calculate a bounding box for the numbers in the images you found? If you did not use a graphical interface, you may need to investigate the bounding boxes by hand. Provide an example of the localization created on a captured image.

In [124]:
from IPython.display import Image as iImage, display

picfolder_path = "./Pics"
pic_paths = os.listdir(picfolder_path)

resized = []

for i in range(5):
    img = Image.open(os.path.join(picfolder_path, pic_paths[i]))
    resized.extend([img.resize((64, 64),Image.ANTIALIAS)])
    resized[i] = np.array(resized[i])

resized = np.array(resized)

graph = tf.Graph()

with graph.as_default():
    # Input
    tf_test_dataset = tf.placeholder(tf.float32, shape=[5,64,64,3])
    
    # Variables
    weight_layer1 = tf.get_variable("ConvW1", shape=[patch_size, patch_size, 
        num_channels, depth1], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer1 = tf.Variable(tf.constant(1.0, shape=[depth1]))
    
    weight_layer2 = tf.get_variable("ConvW2", shape=[patch_size, patch_size, 
        depth1, depth2], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer2 = tf.Variable(tf.constant(1.0, shape=[depth2]))
    
    weight_layer3 = tf.get_variable("ConvW3", shape=[patch_size, patch_size, 
        depth2, depth3], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer3 = tf.Variable(tf.constant(1.0, shape=[depth3]))
    
    weight_layer4 = tf.get_variable("ConvW4", shape=[patch_size, patch_size, 
        depth3, depth4], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer4 = tf.Variable(tf.constant(1.0, shape=[depth4]))
    
    weight_layer5 = tf.get_variable("FcW1", shape=[4 * 4 * depth4, num_outputs], initializer=tf.contrib.layers.xavier_initializer())
    bias_layer5 = tf.Variable(tf.constant(1.0, shape=[num_outputs]))
    
    # Model
    def model(data, ):
        conv1 = tf.nn.conv2d(data, weight_layer1, [1,2,2,1], padding='SAME')
        hidden1 = tf.nn.relu(conv1 + bias_layer1)
        
        conv2 = tf.nn.conv2d(hidden1, weight_layer2, [1,2,2,1], padding='SAME')
        hidden2 = tf.nn.relu(conv2 + bias_layer2)
        
        conv3 = tf.nn.conv2d(hidden2, weight_layer3, [1,2,2,1], padding='SAME')
        hidden3 = tf.nn.relu(conv3 + bias_layer3)
        
        conv4 = tf.nn.conv2d(hidden3, weight_layer4, [1,2,2,1], padding='SAME')
        hidden4 = tf.nn.relu(conv4 + bias_layer4)
        
        shape = hidden4.get_shape().as_list()
        reshape = tf.reshape(hidden4, [-1, shape[1]*shape[2]*shape[3]])
        
        return tf.nn.relu(tf.matmul(reshape, weight_layer5) + bias_layer5)
    
    # Training computation
    logits = model(tf_test_dataset)
    test_prediction = tf.nn.softmax(logits)
    
    saver = tf.train.Saver()

with tf.Session(graph=graph) as session:
    saver.restore(session, tf.train.latest_checkpoint('./tensorflowcheckpoints/'))
    test_prediction = session.run(test_prediction, feed_dict={tf_test_dataset : resized})
    
    for i,img in enumerate(resized):
        plt.imshow(img)
        for j in range(5):
        
            left = test_prediction[i][4*j + 0] * 128
            top = test_prediction[i][4*j + 1] * 128
            width = test_prediction[i][4*j + 2] * 128
            height = test_prediction[i][4*j + 3] * 128
    
            plt.plot([left, left], [top, top + height], 'c')
            plt.plot([left, left+width], [top, top], 'c')
            plt.plot([left+width, left+width], [top, top+height], 'c')
            plt.plot([left, left+width], [top+height, top+height], 'c')
            
        plt.show()

Answer:

The model seems to work fairly well on images captured by my own camera.